PREFACE TO THE FOURTH EDITION


This fourth edition contains several additions. The main ones concern
three closely related topics: Brownian motion, functional limit
distributions, and random walks. Besides the power and ingenuity of
their methods and the depth and beauty of their results, their importance
is fast growing in Analysis as well as in theoretical and applied
Probability.

These additions increased the book to an unwieldy size and it had to 
be split into two volumes. 

About half of the first volume is devoted to an elementary introduction,
then to mathematical foundations and basic probability concepts
and tools. The second half is devoted to a detailed study of Independence
which played and continues to play a central role both by itself and
as a catalyst.

The main additions consist of a section on convergence of probabilities
on metric spaces and a chapter whose first section on domains of attraction
completes the study of the Central limit problem, while the second
one is devoted to random walks.

About a third of the second volume is devoted to conditioning and 
properties of sequences of various types of dependence. The other two 
thirds are devoted to random functions; the last Part on Elements of 
random analysis is more sophisticated. 

The main addition consists of a chapter on Brownian motion and limit 
distributions. 

It is strongly recommended that the reader begin with less involved 
portions. In particular, the starred ones ought to be left out until they 
are needed or unless the reader is especially interested in them.

I take this opportunity to thank Mrs. Rubalcava for her beautiful
typing of all the editions since the inception of the book. I also wish to
thank the editors of Springer-Verlag, New York, for their patience and
care.

M.L.

January, 1977
Berkeley, California







PREFACE TO THE THIRD EDITION 


This book is intended as a text for graduate students and as a reference 
for workers in Probability and Statistics. The prerequisite is honest 
calculus. The material covered in Parts Two to Five inclusive requires 
about three to four semesters of graduate study. The introductory part 
may serve as a text for an undergraduate course in elementary
probability theory.

The Foundations are presented in: 

the Introductory Part on the background of the concepts and problems,
treated without advanced mathematical tools;

Part One on the Notions of Measure Theory that every probabilist 
and statistician requires; 

Part Two on General Concepts and Tools of Probability Theory. 

Random sequences whose general properties are given in the Foundations
are studied in:

Part Three on Independence devoted essentially to sums of independent
random variables and their limit properties;

Part Four on Dependence devoted to the operation of conditioning 
and limit properties of sums of dependent random variables. The 
last section introduces random functions of second order. 

Random functions and processes are discussed in: 

Part Five on Elements of random analysis devoted to the basic concepts
of random analysis and to the martingale, decomposable,
and Markov types of random functions.

Since the primary purpose of the book is didactic, methods are 
emphasized and the book is subdivided into: 

unstarred portions, independent of the remainder; starred portions, 
which are more involved or more abstract; 

complements and details, including illustrations and applications of 
the material in the text, which consist of propositions with frequent
hints; most of these propositions can be found in the
articles and books referred to in the Bibliography. 

Also, for teaching and reference purposes, it has proved useful to name 
most of the results. 

Numerous historical remarks about results, methods, and the evolution
of various fields are an intrinsic part of the text. The purpose is
purely didactic: to attract attention to the basic contributions while 
introducing the ideas explored. Books and memoirs of authors whose 
contributions are referred to and discussed are cited in the Bibliography, 
which parallels the text in that it is organized by parts and, within parts, 
by chapters. Thus the interested student can pursue his study in the 
original literature. 

This work owes much to the reactions of the students on whom it has 
been tried year after year. However, the book is definitely more concise 
than the lectures, and the reader will have to be armed permanently 
with patience, pen, and calculus. Besides, in mathematics, as in any 
form of poetry, the reader has to be a poet in posse. 

This third edition differs from the second (1960) in a number of 
places. Modifications vary all the way from a prefix (“sub”-martingale
in lieu of “semi”-martingale) to an entire subsection (§36.2). To preserve
pagination, some additions to the text proper (especially 9, p. 656)
had to be put in the Complements and Details. It is hoped, moreover,
that most of the errors have been eliminated and that readers will be
kind enough to inform the author of those which remain.

I take this opportunity to thank those whose comments and criticisms 
led to corrections and improvements: for the first edition, E. Barankin, S. 
Bochner, E. Parzen, and H. Robbins; for the second edition, Y. S. Chow, 
R. Cogburn, J. L. Doob, J. Feldman, B. Jamison, J. Karush, P. A. Meyer, 
J. W. Pratt, B. A. Sevastianov, J. W. Woll; for the third edition, S. 
Dharmadhikari, J. Fabius, D. Freedman, A. Maitra, U. V. Prokhorov. 
My warm thanks go to Cogburn, whose constant help throughout the 
preparation of the second edition has been invaluable. This edition has 
been prepared with the partial support of the Office of Naval Research 
and of the National Science Foundation. 

M.L. 

April, 1962 
Berkeley, California


Introductory Part


ELEMENTARY PROBABILITY THEORY 


Probability theory is concerned with the mathematical analysis of
the intuitive notion of “chance” or “randomness,” which, like all notions,
is born of experience. The quantitative idea of randomness first
took form at the gaming tables, and probability theory began, with
Pascal and Fermat (1654), as a theory of games of chance. Since then,
the notion of chance has found its way into almost all branches of
knowledge. In particular, the discovery that physical “observables,” even
those which describe the behavior of elementary particles, were to be
considered as subject to laws of chance made an investigation of the
notion of chance basic to the whole problem of rational interpretation
of nature.

A theory becomes mathematical when it sets up a mathematical 
model of the phenomena with which it is concerned, that is, when, to 
describe the phenomena, it uses a collection of well-defined symbols 
and operations on the symbols. As the number of phenomena, together
with their known properties, increases, the mathematical model
evolves from the early crude notions upon which our intuition was 
built in the direction of higher generality and abstractness. 

In this manner, the inner consistency of the model of random phenomena
became doubtful, and this forced a rebuilding of the whole
structure in the second quarter of this century, starting with a formulation
in terms of axioms and definitions. Thus, there appeared a branch
of pure mathematics, probability theory, concerned with the construction
and investigation per se of the mathematical model of randomness.

The purpose of the Introductory Part (of which the other parts of
this book are independent) is to give “intuitive meaning” to the concepts
and problems of probability theory. First, by analyzing briefly
some ideas derived from everyday experience, especially from games of
chance, we shall arrive at an elementary axiomatic setup; we leave the
illustrations with coins, dice, cards, darts, etc., to the reader. Then,
we shall apply this axiomatic setup to describe in a precise manner
and to investigate in a rigorous fashion a few of the “intuitive notions”
relative to randomness. No special tools will be needed, whereas in
the nonelementary setup measure-theoretic concepts and
Fourier-Stieltjes transforms play a prominent role.




I. INTUITIVE BACKGROUND 


1. Events. The primary notion in the understanding of nature is that
of event: the occurrence or nonoccurrence of a phenomenon. The abstract
concept of event pertains only to its occurrence or nonoccurrence
and not to its nature. This is the concept we intend to analyze. We
shall denote events by A, B, C, ... with or without affixes.

To every event A there corresponds a contrary event “not A,” to
be denoted by $A^c$; $A^c$ occurs if, and only if, A does not occur. An event
may imply another event: A implies B if, when A occurs, then B necessarily
occurs; we write $A \subset B$. If A implies B and also B implies A,
then we say that A and B are equivalent; we write $A = B$. The nature
of two equivalent events may be different, but as long as we are concerned
only with occurrence or nonoccurrence, they can and will be
identified. Events are combined into new events by means of operations
expressed by the terms “and,” “or” and “not.”

A “and” B is an event which occurs if, and only if, both the event A
and the event B occur; we denote it by $A \cap B$ or, simply, by AB. If
AB cannot occur (that is, if A occurs, then B does not occur, and if B
occurs, then A does not occur), we say that the event A and the event
B are disjoint (exclude one another, are mutually exclusive, are
incompatible).

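A “or” B is an event which occurs if, and only if, at least one of the
events A, B occurs; we denote it by $A \cup B$. If, and only if, A and B
are disjoint, we replace “or” by +. Similarly, more than two events
can be combined by means of “and,” “or”; we write

$$A_1 \cap A_2 \cap \cdots \cap A_n \quad\text{or}\quad A_1 A_2 \cdots A_n \quad\text{or}\quad \bigcap_{k=1}^{n} A_k,$$

$$A_1 \cup A_2 \cup \cdots \cup A_n \quad\text{or}\quad \bigcup_{k=1}^{n} A_k; \qquad A_1 + A_2 + \cdots + A_n \quad\text{or}\quad \sum_{k=1}^{n} A_k.$$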

There are two combinations of events which can be considered as
“boundary events”; they are the first and the last events in terms of
implication. Events of the form $A \cup A^c$ can be said to represent an
“always occurrence,” for they can only occur. Since, whatever be the
event A, the events $A \cup A^c$ and the events they imply are equivalent,
all such events are to be identified and will be called the sure event, to
be denoted by $\Omega$. Similarly, events of the form $AA^c$ and the events
which imply them, which can be said to represent a “never occurrence”
for they cannot occur, are to be identified, and will be called the impossible
event, to be denoted by $\emptyset$; thus, the definition of disjoint events A
and B can be written $AB = \emptyset$. The impossible and the sure events are
“first” and “last” events, for, whatever be the event A, we have
$\emptyset \subset A \subset \Omega$.

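The interpretation of symbols $\subset$, $=$, $\cap$, $\cup$, in terms of occurrence
and nonoccurrence, shows at once that

if $A \subset B$, then $B^c \subset A^c$, and conversely;

$AB = BA$, $A \cup B = B \cup A$;

$(AB)C = A(BC)$, $(A \cup B) \cup C = A \cup (B \cup C)$;

$A(B \cup C) = AB \cup AC$, $A \cup BC = (A \cup B)(A \cup C)$;

$(AB)^c = A^c \cup B^c$, $(A \cup B)^c = A^c B^c$, $A \cup B = A + A^c B$;

more generally

$$\Bigl(\bigcap_{k=1}^{n} A_k\Bigr)^c = \bigcup_{k=1}^{n} A_k^c, \qquad \Bigl(\bigcup_{k=1}^{n} A_k\Bigr)^c = \bigcap_{k=1}^{n} A_k^c,$$

and so on.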

We recognize here the rules of operations on sets. In terms of sets,
$\Omega$ is the space in which lie the sets A, B, C, ..., $\emptyset$ is the empty set, $A^c$
is the set complementary to the set A, AB is the intersection, $A \cup B$
is the union of the sets A and B, and $A \subset B$ means that A is contained
in B.
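
Since the rules above are rules of operations on sets, they can be checked mechanically on a small finite space. The following is a minimal sketch, assuming a hypothetical six-point sure event; the names `OMEGA`, `comp`, `all_events`, and `check_rules` are illustrative and do not come from the text.

```python
from itertools import combinations

# Hypothetical sure event: a six-point space (an assumption for illustration).
OMEGA = frozenset(range(1, 7))


def comp(a):
    """Complement of the event a relative to OMEGA."""
    return OMEGA - a


def all_events():
    """Every subset of OMEGA, i.e. every event of this toy space."""
    points = sorted(OMEGA)
    return [frozenset(s) for r in range(len(points) + 1)
            for s in combinations(points, r)]


def check_rules(a, b, c):
    """True if the rules listed in the text hold for the events a, b, c."""
    return all([
        a & b == b & a,
        a | b == b | a,
        (a & b) & c == a & (b & c),
        (a | b) | c == a | (b | c),
        a & (b | c) == (a & b) | (a & c),
        a | (b & c) == (a | b) & (a | c),
        comp(a & b) == comp(a) | comp(b),
        comp(a | b) == comp(a) & comp(b),
        # "A or B" is the sum of the disjoint events A and A^c B.
        a | b == a | (comp(a) & b) and not a & (comp(a) & b),
        # A implies B exactly when B^c implies A^c.
        (a <= b) == (comp(b) <= comp(a)),
    ])


if __name__ == "__main__":
    events = all_events()
    c = frozenset({2, 3, 5})  # one fixed third event keeps the run short
    print(all(check_rules(a, b, c) for a in events for b in events))
```

Representing events as `frozenset` objects makes "and," "or," and "not" correspond directly to `&`, `|`, and the complement relative to `OMEGA`.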

In science, or, more precisely, in the investigation of “laws of nature,”
events are classified into conditions and outcomes of an experiment.
Conditions of an experiment are events which are known or are made to
occur. Outcomes of an experiment are events which may occur when
the experiment is performed, that is, when its conditions occur. All
(finite) combinations of outcomes by means of “not,” “and,” “or,” are
outcomes; in the terminology of sets, the outcomes of an experiment
form a field (or an “algebra” of sets). The conditions of an experiment,
together with its field of outcomes, constitute a trial. Any (finite)
number of trials can be combined by “conditioning,” as follows:

The collective outcomes are combinations by means of “not,”
“and,” “or,” of the outcomes of the constituent trials. The conditions
are conditions of the first constituent trial together with conditions
of the second to which are added the observed outcomes of the
first, and so on. Thus, given the observed outcomes of the preceding
trials, every constituent trial is performed under supplementary
conditions: it is conditioned by the observed outcomes. When, for every
constituent trial, any outcome occurs if, and only if, it occurs without
such conditioning, we say that the trials are completely independent.
If, moreover, the trials are identical, that is, have the same conditions
and the same field of outcomes, we speak of repeated trials or, equivalently,
identical and completely independent trials. The possibility of
repeated trials is a basic assumption in science, and in games of chance:
every trial can be performed again and again, the knowledge of past and
present outcomes having no influence upon future ones.

2. Random events and trials. Science is essentially concerned with
permanencies in repeated trials. For a long time Homo sapiens investigated
deterministic trials only, where the conditions (causes) determine
completely the outcomes (effects). Although another type of permanency
has been observed in games of chance, it is only recently that
Homo sapiens was led to think of a rational interpretation of nature in
terms of these permanencies: nature plays the greatest of all games of
chance with the observer. This type of permanency can be described
as follows:

Let the frequency of an outcome A in n repeated trials be the ratio
$n_A/n$ of the number $n_A$ of occurrences of A to the total number n of
trials. If, in repeating a trial a large number of times, the observed
frequencies of any one of its outcomes A cluster about some number,
the trial is then said to be random. For example, in a game of dice (two
homogeneous ones) “double-six” occurs about once in 36 times, that
is, its observed frequencies cluster about 1/36. The number 1/36 is a
permanent numerical property of “double-six” under the conditions of
the game, and the observed frequencies are to be thought of as measurements
of the property. This is analogous to stating that, say, a bar
at a fixed temperature has a permanent numerical property called its
“length” about which the measurements cluster.
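
The clustering of observed frequencies can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: the dice are simulated by a pseudorandom generator with a fixed seed, and the function name `double_six_frequency` is hypothetical.

```python
import random


def double_six_frequency(n_trials, seed=0):
    """Observed frequency n_A / n of "double-six" in n_trials throws of two dice."""
    rng = random.Random(seed)  # fixed seed, so a run can be repeated exactly
    n_a = 0
    for _ in range(n_trials):
        first, second = rng.randint(1, 6), rng.randint(1, 6)
        if first == 6 and second == 6:
            n_a += 1
    return n_a / n_trials


if __name__ == "__main__":
    # The observed frequencies should settle near 1/36 as n grows.
    for n in (100, 10_000, 200_000):
        print(n, double_six_frequency(n), "vs", 1 / 36)
```

A pseudorandom generator is itself a deterministic device, so this is a numerical analogy for repeated trials rather than a physical experiment.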

The outcomes of a random trial are called random (chance) events.
The number measured by the observed frequencies of a random event
A is called the probability of A and is denoted by PA. Clearly, $P\emptyset = 0$,
$P\Omega = 1$ and, for every A, $0 \le PA \le 1$. Since the frequency of a sum
$A_1 + A_2 + \cdots + A_n$ of disjoint random events is the sum of their
frequencies, we are led to assume that

$$P(A_1 + A_2 + \cdots + A_n) = PA_1 + PA_2 + \cdots + PA_n.$$

Furthermore, let $n_A$, $n_B$, $n_{AB}$ be the respective numbers of occurrences
of outcomes A, B, AB in n repeated random trials. The frequency of
outcome B in the trials in which A occurs is

$$\frac{n_{AB}}{n_A} = \frac{n_{AB}/n}{n_A/n}$$

and measures the ratio PAB/PA, to be called probability of B given A
(given that A occurs); we denote it by $P_A B$ and have

$$PAB = PA \cdot P_A B.$$

Thus, when to the original conditions of the trial is added the fact that
A occurs, then the probability PB of B is transformed into the probability
$P_A B$ of B given A. This leads to defining B as being stochastically
independent of A if $P_A B = PB$ or

$$PAB = PA \cdot PB.$$

Then it follows that A is stochastically independent of B, for

$$P_B A = \frac{PAB}{PB} = PA,$$

and it suffices to say that A and B are stochastically independent. (We
assumed in the foregoing ratios that the denominators were not null.)
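
The relations between PAB, the probability of B given A, and stochastic independence can be worked out exactly on a small example. The sketch below assumes a hypothetical trial, two dice with 36 equally likely ordered results; the events and the helper names `prob` and `cond_prob` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed example: the 36 equally likely ordered results of two dice.
OMEGA = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))


def cond_prob(b, a):
    """Probability of b given a, that is PAB / PA, for PA not null."""
    p_a = prob(a)
    if p_a == 0:
        raise ValueError("conditioning event has null probability")
    return prob(lambda w: a(w) and b(w)) / p_a


def first_even(w):
    return w[0] % 2 == 0


def sum_is_seven(w):
    return w[0] + w[1] == 7


def sum_is_eight(w):
    return w[0] + w[1] == 8


if __name__ == "__main__":
    # Independent pair: conditioning on either event changes nothing.
    print(cond_prob(sum_is_seven, first_even), prob(sum_is_seven))
    print(cond_prob(first_even, sum_is_seven), prob(first_even))
    # Dependent pair: the conditional probability differs from PB.
    print(cond_prob(sum_is_eight, first_even), prob(sum_is_eight))
```

Exact rational arithmetic (`Fraction`) is used so that the equality PAB = PA PB can be tested without rounding error.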

Similarly, if a collective trial is such that the probability of any outcome
of any constituent random trial is independent of the observed
outcomes of preceding constituents, we say that the constituent random
trials are stochastically independent. Clearly, complete independence
defined in terms of occurrences implies stochastic independence
defined in terms of probability alone. Thus, as long as we are concerned
with stochastic independence only, the concept of repeated trials reduces
to that of identical and stochastically independent trials.

3. Random variables. For a physicist, the outcomes are, in general,
values of an observable. From the gambler’s point of view, what
counts is not the observed outcome of a random trial but the corresponding
gain or loss. In either case, when there is only a finite number
of possible outcomes, the sure outcome $\Omega$ is partitioned into a number
of disjoint outcomes $A_1, A_2, \ldots, A_m$. The random variable X, say,
the chance gain of the gambler, is stated by assigning to these outcomes
numbers $x_{A_1}, x_{A_2}, \ldots, x_{A_m}$, which may be positive, null, or negative.
The “average gain” in n repeated random trials is

$$x_{A_1}\frac{n_{A_1}}{n} + x_{A_2}\frac{n_{A_2}}{n} + \cdots + x_{A_m}\frac{n_{A_m}}{n}.$$

Since the trial is random, this average clusters about
$x_{A_1}PA_1 + x_{A_2}PA_2 + \cdots + x_{A_m}PA_m$, which is defined as the
expectation EX of the random
variable X. It is easily seen that the averages of a sum of two random
variables X and Y cluster about the sum of their averages, that is,

$$E(X + Y) = EX + EY.$$


The concept of random variable is more general than that of a random
event. In fact, we can assign to every random event A a random variable,
its indicator $I_A = 1$ or 0 according as A occurs or does not occur.
Then, the observed value of $I_A$ tells us whether or not A occurred, and
conversely. Furthermore, we have $EI_A = 1 \cdot PA + 0 \cdot PA^c = PA$.

A physical observable may have an infinite number of possible values, 
and then the foregoing simple definitions do not apply. The evolution 
of probability theory is due precisely to the consideration of more and 
more complicated observables. 






II. AXIOMS; INDEPENDENCE AND THE
BERNOULLI CASE



We give now a consistent model for the intuitive concepts which appeared
in the foregoing brief analysis; we shall later see that this model
has to be extended.

1. Axioms of the finite case. Let $\Omega$ or the sure event be a space of
points $\omega$; the empty set (set containing no points $\omega$) or the impossible
event will be denoted by $\emptyset$. Let $\mathcal{A}$ be a nonempty class of sets in $\Omega$, to
be called random events or, simply, events, since no other type of events
will be considered. Events will be denoted by capitals A, B, ... with
or without affixes. Let P or probability be a numerical function defined
on $\mathcal{A}$; the value of P for an event A will be called the probability
of A and will be denoted by PA. The pair $(\mathcal{A}, P)$ is called a probability
field and the triplet $(\Omega, \mathcal{A}, P)$ is called a probability space.

Axiom I. $\mathcal{A}$ is a field: complements $A^c$, finite intersections $\bigcap_{k=1}^{n} A_k$
and finite unions $\bigcup_{k=1}^{n} A_k$ of events are events.

Axiom II. P on $\mathcal{A}$ is normed, nonnegative, and finitely additive:

$$P\Omega = 1, \qquad PA \ge 0, \qquad P\sum_{k=1}^{n} A_k = \sum_{k=1}^{n} PA_k.$$

It suffices to assume additivity for two arbitrary disjoint events, since 
the general case follows by induction. 

Since $\emptyset$ is disjoint from any event A and $A + \emptyset = A$, we have

$$PA = P(A + \emptyset) = PA + P\emptyset,$$

so that $P\emptyset = 0$. Furthermore, it is immediate that, if $A \subset B$, then
$PA \le PB$, and also that

$$P\bigcup_{k=1}^{n} A_k = PA_1 + PA_1^c A_2 + \cdots + PA_1^c A_2^c \cdots A_{n-1}^c A_n \le \sum_{k=1}^{n} PA_k.$$

The axioms are consistent. 









To see this, it suffices to construct an example in which the axioms
are both verified: take as the field $\mathcal{A}$ of events $\Omega$ and $\emptyset$ only, and set
$P\Omega = 1$, $P\emptyset = 0$. A less trivial example is that of a simple probability field:
1° The events, except $\emptyset$, are formed by all sums of disjoint events
$A_1, A_2, \ldots, A_n$ which form a finite partition of the sure event:
$A_1 + A_2 + \cdots + A_n = \Omega$; 2° to every event $A_k$ of the partition is assigned a
probability $p_k = PA_k$ such that every $p_k \ge 0$ and $\sum_{k=1}^{n} p_k = 1$ (this is
always possible). Then P is defined on $\mathcal{A}$, consistently with axiom II,
by assigning to every event A as its probability the sum of probabilities
of those $A_k$ whose sum is A.
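
The construction of a simple probability field can be carried out explicitly for a small partition. The following is a minimal sketch, assuming a hypothetical partition into three events with assumed probabilities 1/2, 1/3, 1/6; an event is represented by the set of names of the partition events whose sum it is, and all identifiers are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical partition of the sure event, with assumed probabilities
# p_k >= 0 summing to 1.
ATOMS = {"A1": Fraction(1, 2), "A2": Fraction(1, 3), "A3": Fraction(1, 6)}
OMEGA = frozenset(ATOMS)


def events():
    """All sums of partition events; the empty sum is the impossible event."""
    names = sorted(ATOMS)
    return [frozenset(s) for r in range(len(names) + 1)
            for s in combinations(names, r)]


def prob(event):
    """Sum of the probabilities of the partition events whose sum is the event."""
    return sum((ATOMS[name] for name in event), Fraction(0))


def is_field(evs):
    """Axiom I: closed under complements, finite intersections and unions."""
    s = set(evs)
    return all(OMEGA - a in s for a in s) and all(
        a & b in s and a | b in s for a in s for b in s)


def satisfies_axiom_two(evs):
    """Axiom II: normed, nonnegative, additive on pairs of disjoint events."""
    return (prob(OMEGA) == 1
            and all(prob(a) >= 0 for a in evs)
            and all(prob(a | b) == prob(a) + prob(b)
                    for a in evs for b in evs if not a & b))


if __name__ == "__main__":
    evs = events()
    print(len(evs), is_field(evs), satisfies_axiom_two(evs))
```

Additivity is checked for two disjoint events only, which suffices because the general finite case follows by induction.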

2. Simple random variables. Let the probability field $(\mathcal{A}, P)$ be
fixed. In order to introduce the concept of random variables, it will be
convenient to begin with very special ones, which permit operations on
events to be transformed into ordinary algebraic operations.

To every event A we assign a function $I_A$ on $\Omega$ with values $I_A(\omega)$
such that $I_A(\omega) = 1$ or 0 according as $\omega$ belongs or does not belong to
A; $I_A$ will be called the indicator of A (in terms of occurrences, $I_A = 1$
or 0, according as A occurs or does not occur). Thus, $I_A^2 = I_A$ and
the boundary cases are those of $I_\emptyset = 0$ and $I_\Omega = 1$ (if, in a relation
containing functions of an argument, the argument does not figure,
then the relation holds for all values of the argument unless otherwise
stated).

The following properties are immediate:

if $A \subset B$, then $I_A \le I_B$, and conversely;

if $A = B$, then $I_A = I_B$, and conversely;

$I_{A^c} = 1 - I_A$, $I_{AB} = I_A I_B$, $I_{A+B} = I_A + I_B$,

$I_{A \cup B} = I_{A + A^c B} = I_A + I_B - I_{AB}$

and, more generally,

$$I_{\bigcap_{k=1}^{n} A_k} = \prod_{k=1}^{n} I_{A_k}, \qquad I_{\sum_{k=1}^{n} A_k} = \sum_{k=1}^{n} I_{A_k},$$

$$I_{\bigcup_{k=1}^{n} A_k} = I_{A_1} + (1 - I_{A_1}) I_{A_2} + \cdots + (1 - I_{A_1}) \cdots (1 - I_{A_{n-1}}) I_{A_n}.$$

Linear combinations $X = \sum_{j=1}^{m} x_j I_{A_j}$ of indicators of events $A_j$ of a finite
partition of $\Omega$, where the $x_j$ are (finite) numbers, are called simple
random variables, to be denoted by capitals X, Y, ..., with or without
affixes. By convention, every written linear combination of indicators
will be that of indicators of disjoint events whose sum is the sure event;
however, when $x_j = 0$, we may drop the corresponding null term $x_j I_{A_j} = 0$
from the linear combination. The set of values $PA_j$ which correspond
to the values $x_j$ of X, assumed all distinct, is called the probability
distribution and the $A_j$ form the partition of X. The expectation EX
of a simple random variable $X = \sum_{j=1}^{m} x_j I_{A_j}$ is defined by

$$EX = \sum_{j=1}^{m} x_j PA_j.$$

Clearly, any constant c is a simple random variable, and the sum or the
product of two simple random variables is a simple random variable;
$E(c) = c$, $EcX = cEX$; if $X \ge 0$, that is, all its values $x_j \ge 0$, then
$EX \ge 0$; if $X \ge Y$, then $EX \ge EY$. Furthermore, expectations possess
the following basic property.

Addition property. The expectation of a sum of (a finite number of)
simple random variables is the sum of their expectations.

It suffices to prove the assertion for a sum of two simple random variables

$$X = \sum_{j=1}^{m} x_j I_{A_j}, \qquad Y = \sum_{k=1}^{n} y_k I_{B_k},$$

since the general case follows by induction. Because of the properties
of probabilities and indicators given above,

$$EX + EY = \sum_{j=1}^{m} x_j PA_j + \sum_{k=1}^{n} y_k PB_k = \sum_{j=1}^{m} \sum_{k=1}^{n} (x_j + y_k) PA_j B_k$$

while

$$E(X + Y) = E \sum_{j=1}^{m} \sum_{k=1}^{n} (x_j + y_k) I_{A_j B_k} = \sum_{j=1}^{m} \sum_{k=1}^{n} (x_j + y_k) PA_j B_k,$$

and the conclusion is reached.

Application to probabilities of combinations of events. To begin with,
we observe that

$$EI_A = 1 \cdot PA + 0 \cdot PA^c = PA.$$

Therefore, from

$$I_{A \cup B} = I_A + I_B - I_{AB}$$

it follows, upon taking expectations of both sides, that

$$P(A \cup B) = PA + PB - PAB.$$

Similarly, from

$$I_{A \cup B \cup C} = I_A + (1 - I_A) I_B + (1 - I_A)(1 - I_B) I_C$$

it follows, upon expanding the right-hand side and taking expectations,
that

$$P(A \cup B \cup C) = PA + PB + PC - PAB - PBC - PCA + PABC,$$

and so on.
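
The passage from indicators to probabilities of unions can be verified on a concrete case. The sketch below assumes a hypothetical space of two dice with equally likely results and three arbitrarily chosen events; `expectation`, `indicator`, and the event names are illustrative, not taken from the text.

```python
from fractions import Fraction
from itertools import product

# Assumed example: the 36 equally likely ordered results of two dice.
OMEGA = list(product(range(1, 7), repeat=2))


def expectation(x):
    """EX for a simple random variable x given as a function on OMEGA."""
    return sum(Fraction(x(w)) for w in OMEGA) / len(OMEGA)


def indicator(event):
    """Indicator of an event given as a predicate on outcomes."""
    return lambda w: 1 if event(w) else 0


# Three hypothetical events A, B, C and their indicators.
i_a = indicator(lambda w: w[0] == 6)
i_b = indicator(lambda w: w[1] == 6)
i_c = indicator(lambda w: w[0] + w[1] >= 10)


def union_probability():
    """P(A or B or C) as the expectation of the indicator of the union."""
    return expectation(lambda w: i_a(w) + (1 - i_a(w)) * i_b(w)
                       + (1 - i_a(w)) * (1 - i_b(w)) * i_c(w))


def inclusion_exclusion():
    """PA + PB + PC - PAB - PBC - PCA + PABC, term by term."""
    e = expectation
    return (e(i_a) + e(i_b) + e(i_c)
            - e(lambda w: i_a(w) * i_b(w))
            - e(lambda w: i_b(w) * i_c(w))
            - e(lambda w: i_c(w) * i_a(w))
            + e(lambda w: i_a(w) * i_b(w) * i_c(w)))


if __name__ == "__main__":
    print(union_probability(), inclusion_exclusion())
```

Both functions return the same rational number because expanding the product form of the indicator of the union and applying the addition property gives exactly the seven-term sum.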

The foregoing properties of expectations lead to the celebrated

Tchebichev inequality. If X is a simple random variable, then, for
every $\epsilon > 0$,

$$P[|X| \ge \epsilon] \le \frac{1}{\epsilon^2} EX^2.$$

$[|X| \ge \epsilon]$ is to be read: the union of all those events for which the
values of $|X|$ are $\ge \epsilon$.

The inequality follows from

$$EX^2 = E(X^2 I_{[|X| \ge \epsilon]}) + E(X^2 I_{[|X| < \epsilon]}) \ge E(X^2 I_{[|X| \ge \epsilon]}) \ge \epsilon^2 P[|X| \ge \epsilon].$$
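
A numerical check of the inequality on one simple random variable may help fix ideas. The distribution below is a hypothetical example chosen for illustration, and the names `DISTRIBUTION`, `tail`, and `chebyshev_bound` are assumptions of this sketch.

```python
from fractions import Fraction

# A hypothetical simple random variable: distinct values x_j with
# probabilities PA_j summing to 1.
DISTRIBUTION = {-2: Fraction(1, 8), -1: Fraction(1, 4), 0: Fraction(1, 4),
                1: Fraction(1, 4), 3: Fraction(1, 8)}


def expectation(g):
    """E g(X), the sum of g(x_j) PA_j."""
    return sum(g(x) * p for x, p in DISTRIBUTION.items())


def tail(eps):
    """P[|X| >= eps]."""
    return sum(p for x, p in DISTRIBUTION.items() if abs(x) >= eps)


def chebyshev_bound(eps):
    """EX^2 / eps^2, the right-hand side of the inequality."""
    return expectation(lambda x: x * x) / Fraction(eps) ** 2


if __name__ == "__main__":
    for eps in (Fraction(1, 2), 1, 2, 3, 4):
        print(eps, tail(eps), chebyshev_bound(eps))
```

For small values of eps the bound exceeds 1 and says nothing; it becomes informative only once eps is large compared with the square root of EX^2.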

3. Independence. Two events $A_1$, $A_2$ are said to be stochastically
independent or, simply, independent (no other type of independence of
events will be considered) if

$$PA_1 A_2 = PA_1 \, PA_2.$$

More generally, events $A_k$, k = 1, 2, ..., n, are independent if, for every
$m \le n$ and for arbitrary distinct integers $k_1, k_2, \ldots, k_m \le n$,

$$PA_{k_1} A_{k_2} \cdots A_{k_m} = PA_{k_1} \, PA_{k_2} \cdots PA_{k_m}.$$

If this property holds for all events $A_k$ selected arbitrarily each within
a different class $\mathcal{A}_k$, we say that these classes are independent. Simple
random variables $X_k$, k = 1, 2, ..., n, are said to be independent if the
partitions on which they are defined are independent. A basic property
of independent simple random variables is the following

Multiplication property. The expectation of a product (of a finite
number) of independent simple random variables is the product of their
expectations.

It suffices to give the proof for two independent simple random variables,

$$X = \sum_{j=1}^{m} x_j I_{A_j}, \qquad Y = \sum_{k=1}^{n} y_k I_{B_k}, \qquad \text{all } x_j \ (y_k) \text{ distinct},$$

since the general case follows by induction. Because of independence,

$$EXY = E \sum_{j=1}^{m} \sum_{k=1}^{n} x_j y_k I_{A_j B_k} = \sum_{j=1}^{m} \sum_{k=1}^{n} x_j y_k \, PA_j \, PB_k = \Bigl(\sum_{j=1}^{m} x_j PA_j\Bigr)\Bigl(\sum_{k=1}^{n} y_k PB_k\Bigr) = EX \, EY,$$

and the conclusion is reached.

The expectation $E(X - EX)^2$, called the variance of X, is denoted
by $\sigma^2 X$. By the additive property,

$$\sigma^2 X = E(X^2 - 2X \, EX + E^2 X) = EX^2 - E^2 X.$$

The celebrated Bienaymé equality follows from the additive and
multiplicative properties.

Bienaymé equality. If $X_k$, k = 1, 2, ..., n, are independent, then

$$\sigma^2 \sum_{k=1}^{n} X_k = \sum_{k=1}^{n} \sigma^2 X_k.$$

Since

$$E(X_k - EX_k) = EX_k - EX_k = 0$$

and independence of the $X_k$ implies independence of the $X_k - EX_k$, it
follows that

$$\sigma^2 \sum_k X_k = E\Bigl(\sum_k X_k - \sum_k EX_k\Bigr)^2 = E\Bigl\{\sum_k (X_k - EX_k)\Bigr\}^2$$

$$= \sum_k E(X_k - EX_k)^2 + \sum_{j \ne k} E(X_j - EX_j)(X_k - EX_k)$$

$$= \sum_k \sigma^2 X_k + \sum_{j \ne k} E(X_j - EX_j) \, E(X_k - EX_k) = \sum_k \sigma^2 X_k.$$

Observe that we used independence of the $X_k$ considered two by two
only.
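
The remark that independence two by two suffices can be illustrated by a standard small example: two fair bits and their sum modulo 2. These three variables are independent in pairs but not independent as a triple, yet the variance of their sum is still the sum of their variances. The example and all names in the sketch are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Four equally likely points: two fair bits and their sum modulo 2 (an
# assumed example of variables independent two by two only).
OMEGA = [(x1, x2, (x1 + x2) % 2) for x1, x2 in product((0, 1), repeat=2)]


def expectation(f):
    """EX for a function f on the four equally likely points."""
    return sum(Fraction(f(w)) for w in OMEGA) / len(OMEGA)


def variance(f):
    """Variance computed as EX^2 - E^2 X."""
    return expectation(lambda w: f(w) ** 2) - expectation(f) ** 2


def coordinate(k):
    """The k-th variable X_k as a function on OMEGA."""
    return lambda w: w[k]


def total(w):
    """The sum X_1 + X_2 + X_3."""
    return sum(w)


if __name__ == "__main__":
    xs = [coordinate(k) for k in range(3)]
    print(variance(total), sum(variance(x) for x in xs))
    # The triple is not independent: E(X1 X2 X3) differs from EX1 EX2 EX3.
    print(expectation(lambda w: w[0] * w[1] * w[2]),
          expectation(xs[0]) * expectation(xs[1]) * expectation(xs[2]))
```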

4. Bernoulli case. A simple case of independence has played a central
role in the evolution of probability theory. This is the Bernoulli
case of events $A_k$, k = 1, 2, ..., which are independent whatever be
their total number n under consideration and such that their probabilities
$PA_k$ have the same value p.

We observe that independence of the $A_k$, k = 1, 2, ..., n, implies
independence of the $A_k' = A_k$ or $A_k^c$, and, more generally, of the n
fields $\mathcal{A}_k = \{\emptyset, A_k, A_k^c, \Omega\}$. For example,

$$PA_{k_1}^c A_{k_2} \cdots A_{k_m} = PA_{k_2} \cdots A_{k_m} - PA_{k_1} A_{k_2} \cdots A_{k_m}$$

$$= PA_{k_2} \cdots PA_{k_m} - PA_{k_1} PA_{k_2} \cdots PA_{k_m}$$

$$= (1 - PA_{k_1}) PA_{k_2} \cdots PA_{k_m} = PA_{k_1}^c \, PA_{k_2} \cdots PA_{k_m},$$

where the subscripts are all distinct and $\le n$. These fields correspond
to repeated random trials where an outcome A at the kth trial is represented
by $A_k$.

The number of occurrences of outcome A in n repeated trials is represented
by a simple random variable $S_n = \sum_{k=1}^{n} I_{A_k}$. To write $S_n$ in the
usual form, that is, with values assigned to events of a partition of the
sure event, we observe that

$$I_{A_k} = I_{A_k} \prod_{i \ne k} (I_{A_i} + I_{A_i^c}).$$

It follows, upon substituting in $S_n$ and expanding, that

$$S_n = \sum_{j=0}^{n} j \, I_{B_j},$$

where

$$I_{B_j} = \sum I_{A_{k_1}} \cdots I_{A_{k_j}} I_{A_{k_{j+1}}^c} \cdots I_{A_{k_n}^c}.$$

The summation is over all permutations of subscripts k = 1, 2, ..., n
classified into two groups, one having j terms and the other having
n - j terms.

On account of the independence, the expectations of the terms under
the summation sign are

$$PA_{k_1} PA_{k_2} \cdots PA_{k_j} PA_{k_{j+1}}^c \cdots PA_{k_n}^c = p^j q^{n-j}, \qquad q = 1 - p,$$

and, therefore, the probability of j occurrences in n trials is given by

$$P[S_n = j] = PB_j = \frac{n!}{j!\,(n-j)!} \, p^j q^{n-j}, \qquad j = 0, 1, \ldots, n.$$

With this result we can compute directly the expectation and variance
of $S_n$, but we prefer to use the additive property which gives

$$ES_n = E \sum_{k=1}^{n} I_{A_k} = \sum_{k=1}^{n} PA_k = np,$$

and the Bienaymé equality for independent random variables $I_{A_k}$ which
gives

$$\sigma^2 S_n = \sum_{k=1}^{n} \sigma^2 I_{A_k} = npq,$$

since

$$\sigma^2 I_{A_k} = E(I_{A_k} - EI_{A_k})^2 = EI_{A_k}^2 - E^2 I_{A_k} = EI_{A_k} - E^2 I_{A_k} = p - p^2 = pq.$$
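
The distribution of the number of occurrences, together with its expectation np and variance npq, can be recomputed by brute force for a small number of trials. The sketch below is a check under stated assumptions: it enumerates all sequences of occurrences and nonoccurrences, weights each by independence, and compares with the closed formula; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def occurrence_distribution(n, p):
    """P[S_n = j], j = 0..n, by summing over all 2**n outcomes of n trials."""
    q = 1 - p
    dist = [Fraction(0)] * (n + 1)
    for outcome in product((0, 1), repeat=n):  # 1 marks an occurrence of A
        j = sum(outcome)
        dist[j] += p ** j * q ** (n - j)       # independence of the trials
    return dist


def binomial_formula(n, p):
    """The closed form n! / (j! (n - j)!) p^j q^(n - j)."""
    q = 1 - p
    return [comb(n, j) * p ** j * q ** (n - j) for j in range(n + 1)]


def mean_and_variance(dist):
    """Expectation and variance computed directly from the distribution."""
    mean = sum(j * pj for j, pj in enumerate(dist))
    second = sum(j * j * pj for j, pj in enumerate(dist))
    return mean, second - mean ** 2


if __name__ == "__main__":
    n, p = 5, Fraction(1, 3)
    dist = occurrence_distribution(n, p)
    print(dist == binomial_formula(n, p))
    print(mean_and_variance(dist), (n * p, n * p * (1 - p)))
```

The enumeration costs 2**n steps, so it is usable only for small n; it is meant as a verification of the formula, not as a way to compute it.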

In order to justify the model investigated so far, we ought to give a
precise and acceptable “meaning” to the notion of “clustering of frequencies”
which, as we have seen, is at the very root of the interpretation
of randomness. The most celebrated interpretation, and rightly so,
is the following

Bernoulli law of large numbers (1713). In the Bernoulli case,
for every $\epsilon > 0$, as $n \to \infty$,

$$P\Bigl[\Bigl|\frac{S_n}{n} - p\Bigr| \ge \epsilon\Bigr] \to 0.$$

In other words, the probability distribution of values of the frequency
$S_n/n$ of an outcome in n repeated trials concentrates at the value p of
the probability of the outcome, as the number of trials increases
indefinitely.

The proof is immediate for, upon applying the Tchebichev inequality,
we have, as $n \to \infty$,

$$P\Bigl[\Bigl|\frac{S_n}{n} - p\Bigr| \ge \epsilon\Bigr] = P[|S_n - ES_n| \ge n\epsilon] \le \frac{1}{n^2 \epsilon^2} \, \sigma^2 S_n = \frac{pq}{n \epsilon^2} \to 0.$$

Observe that only independence two by two has been required.
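
The concentration of the frequency at p, and the bound used in the proof, can be tabulated exactly for moderate n. This is a minimal sketch assuming the Bernoulli case with a rational p, so that exact arithmetic is possible; `deviation_probability` and `chebyshev_bound` are illustrative names.

```python
from fractions import Fraction
from math import comb


def deviation_probability(n, p, eps):
    """Exact P[|S_n / n - p| >= eps] in the Bernoulli case."""
    q = 1 - p
    return sum(comb(n, j) * p ** j * q ** (n - j)
               for j in range(n + 1)
               if abs(Fraction(j, n) - p) >= eps)


def chebyshev_bound(n, p, eps):
    """The bound pq / (n eps^2) obtained from the Tchebichev inequality."""
    return p * (1 - p) / (n * eps ** 2)


if __name__ == "__main__":
    p, eps = Fraction(1, 2), Fraction(1, 10)
    for n in (10, 100, 400):
        exact = deviation_probability(n, p, eps)
        print(n, float(exact), float(chebyshev_bound(n, p, eps)))
```

The exact probabilities fall off much faster than the bound; the inequality is crude, but it is all the proof requires.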

A particular sequence of Bernoulli cases, introduced by Poisson,
shows that the finite setup considered so far is not satisfactory, at least
from the sophisticated mathematician's point of view.

Consider a sequence of Bernoulli cases of independent events $A_{nk}$,
k = 1, 2, ..., n, n = 1, 2, ..., of the same probability $p_n$ which varies
with the number n of trials in such a manner that the expectation of
the number of occurrences $S_n = \sum_{k=1}^{n} I_{A_{nk}}$ remains constant:
$ES_n = np_n = \lambda$. Then, as $n \to \infty$ while j remains fixed,

$$P[S_n = j] = \frac{n!}{j!\,(n-j)!} \, p_n^{\,j} (1 - p_n)^{n-j} = \frac{n(n-1)\cdots(n-j+1)}{j!} \Bigl(\frac{\lambda}{n}\Bigr)^{j} \Bigl(1 - \frac{\lambda}{n}\Bigr)^{n-j} \to \frac{\lambda^j}{j!} \, e^{-\lambda},$$

and we have the following

Poisson theorem (1832). If $S_n = \sum_{k=1}^{n} I_{A_{nk}}$ is the sum of indicators of
independent equiprobable events, such that the expectation $ES_n = \lambda > 0$
remains constant as n varies, then, as $n \to \infty$,

$$P[S_n = j] \to \frac{\lambda^j}{j!} \, e^{-\lambda}, \qquad j = 0, 1, 2, \ldots.$$

Since

$$\sum_{j=0}^{\infty} \frac{\lambda^j}{j!} \, e^{-\lambda} = e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} = 1,$$

we can say that, in the foregoing passage to the limit, no positive probability
escapes to infinity. The total probability is now distributed
among a denumerable number of values j = 0, 1, 2, ..., provided we
assume that the probability of the sum of a denumerable number of
disjoint events $[S_n = j]$ is the sum of their probabilities. However, in
the setup of § 1 neither a denumerable sum of events nor the property
just stated has content. Thus, if we want to give an interpretation to
Poisson’s result, we have to expand the model so as to include the
preceding possibilities.
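
The passage to the limit in Poisson's result can be observed numerically. The sketch below uses floating-point arithmetic and an assumed value 2 for the constant expectation; the helper names `binomial_pmf`, `poisson_pmf`, and `max_gap` are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(n, p, j):
    """P[S_n = j] in the Bernoulli case with n trials and probability p."""
    return comb(n, j) * p ** j * (1 - p) ** (n - j)


def poisson_pmf(lam, j):
    """The limit value lam^j e^(-lam) / j!."""
    return lam ** j / factorial(j) * exp(-lam)


def max_gap(n, lam, j_max=10):
    """Largest gap between the two probabilities over j = 0..j_max, p_n = lam / n.

    Assumes n >= j_max so that every j considered is a possible value of S_n.
    """
    p_n = lam / n
    return max(abs(binomial_pmf(n, p_n, j) - poisson_pmf(lam, j))
               for j in range(j_max + 1))


if __name__ == "__main__":
    lam = 2.0  # assumed constant expectation n * p_n
    for n in (10, 100, 10_000):
        print(n, max_gap(n, lam))
    # No probability escapes to infinity: the limit values sum to 1.
    print(sum(poisson_pmf(lam, j) for j in range(60)))
```

The gap shrinks roughly in proportion to 1/n for fixed expectation, which is consistent with the limit statement but is an empirical observation of this sketch, not a claim made in the text.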

5. Axioms for the countable case. As soon as the concept of infinity 
appears, intuition fails and the vague everyday idea of randomness 
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yields nothing. A first and obvious way to pass from the finite to the 

infinite is to extrapolate, that is, to postulate that properties of the 

finite case continue to hold in the infinite case. Yet these extrapolations 

have to be meaningful and consistent. 

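In set theory, intersections $\bigcap_{n=1}^{\infty} A_n$ and unions $\bigcup_{n=1}^{\infty} A_n$ of sets $A_n$, where
$n$ runs over the denumerable set of integers, continue to be defined as
the sets of points which belong to every $A_n$ and to at least one $A_n$,
respectively. We still have that

$$\Bigl(\bigcap_{n=1}^{\infty} A_n\Bigr)^{c} = \bigcup_{n=1}^{\infty} A_n^{c}, \qquad
\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr)^{c} = \bigcap_{n=1}^{\infty} A_n^{c},$$

$$\bigcup_{n=1}^{\infty} A_n = A_1 + A_1^{c}A_2 + A_1^{c}A_2^{c}A_3 + \cdots \ \text{ad infinitum}$$

and, correspondingly,

$$I_{\bigcap_{n=1}^{\infty} A_n} = \prod_{n=1}^{\infty} I_{A_n}, \qquad
I_{\sum_{n=1}^{\infty} A_n} = \sum_{n=1}^{\infty} I_{A_n},$$

$$I_{\bigcup_{n=1}^{\infty} A_n} = I_{A_1} + I_{A_1^{c}} I_{A_2} + I_{A_1^{c}} I_{A_2^{c}} I_{A_3} + \cdots.$$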

If we want all countable (finite or denumerable) combinations of events
by means of “not,” “and,” “or,” to be events and their probabilities
to be defined, then axioms I and II become

Axiom I'. Events form a $\sigma$-field $\mathcal{A}$: Complements $A^{c}$, countable
intersections $\bigcap_j A_j$ and countable unions $\bigcup_j A_j$ of events are events.

Axiom II'. Probability $P$ on $\mathcal{A}$ is normed, nonnegative, and $\sigma$-additive:

$$P\Omega = 1, \qquad PA \ge 0, \qquad P\sum_j A_j = \sum_j PA_j.$$

It follows that

Covering rule: $P\bigcup_j A_j = PA_1 + PA_1^{c}A_2 + PA_1^{c}A_2^{c}A_3 + \cdots \le \sum_j PA_j$.

These axioms are consistent, since the examples constructed for the
finite case continue to apply trivially. A nontrivial example in the
infinite case is that of nonsimple elementary probability fields: 1° The
events, except $\emptyset$, are formed by all countable sums of events $A_n$ which
form a denumerable partition of the sure event: $\sum_{n=1}^{\infty} A_n = \Omega$; 2° to
every event $A_n$ of the partition is assigned probability $p_n = PA_n$ such
that every $p_n \ge 0$ and $\sum_{n=1}^{\infty} p_n = 1$; this is always possible. Then $P$ is
defined on $\mathcal{A}$, consistently with axiom II', by assigning to every event
$A$ as its probability the sum (finite sum or convergent series) of
probabilities of those $A_n$ whose sum is $A$.

6. Elementary random variables. A linear combination $X = \sum_j x_j I_{A_j}$
of a countable number of indicators of disjoint events $A_j$ is an
elementary random variable $X$; if $j$ varies over a finite set, then $X$
reduces to a simple random variable. Clearly a sum or a product of two
elementary random variables is an elementary random variable. We
may still try to define the expectation $EX$ by

$$EX = \sum_j x_j PA_j.$$

But, if the sum is a divergent series, it has no content or is infinite.
Furthermore, even if it is a convergent series, it may not be absolutely
convergent, so that by changing the order of terms we can change its
value, and the expectation is no longer well defined if no ordering is
specified; this is undesirable according to the very meaning of an
expectation. We are therefore led to define $EX$ by the foregoing
expression only when the right-hand side is absolutely convergent, so that

if $EX$ exists and is finite, then $E|X|$ exists and is finite; and conversely.

(We recognize here an integrable elementary function in the sense of
Lebesgue with respect to the measure $P$.)
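
The dependence on the order of terms can be made concrete. The sketch below is an illustration under stated assumptions, not part of the original text: it uses a hypothetical elementary random variable with $PA_j = 6/(\pi^2 j^2)$ and $x_j = (-1)^{j+1}\pi^2 j/6$, so that $x_j PA_j = (-1)^{j+1}/j$ is the alternating harmonic series, which converges but not absolutely. Summed in the natural order it approaches $\log 2$; with two positive terms taken for every negative one it approaches $\tfrac{3}{2}\log 2$.

```python
import math

def term(j):
    """x_j * PA_j for the hypothetical variable described in the lead-in."""
    p_j = 6.0 / (math.pi**2 * j**2)                 # probabilities sum to 1
    x_j = (-1) ** (j + 1) * math.pi**2 * j / 6.0    # values of X
    return x_j * p_j                                # equals (-1)^(j+1) / j

def natural_order_sum(n_terms):
    """Partial sum in the order j = 1, 2, 3, ..."""
    return sum(term(j) for j in range(1, n_terms + 1))

def rearranged_sum(n_blocks):
    """Two positive terms (odd j), then one negative term (even j), repeated."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(n_blocks):
        total += term(odd) + term(odd + 2)
        odd += 4
        total += term(even)
        even += 2
    return total

if __name__ == "__main__":
    print(natural_order_sum(100000), math.log(2.0))
    print(rearranged_sum(100000), 1.5 * math.log(2.0))
```

Both orderings use exactly the same terms, yet the limits differ, which is why the expectation is defined only under absolute convergence.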

The argument used to prove the addition property of simple random
variables continues to apply to finite sums of elementary random
variables whose expectations exist and are finite, provided $\sigma$-additivity of
$P$ is used. We obtain:

If the expectations of a finite number of elementary random variables
exist and are finite, then the expectation of their sum exists and is finite
and is the sum of their expectations.

Also, Tchebichev's inequality remains valid, provided its right-hand
side exists and is finite.

Independence of a countable number of events or $\sigma$-fields $\mathcal{C}_j$
contained in $\mathcal{A}$ is defined to be independence of every finite number of
these events, or $\sigma$-fields. Independence of a countable number of
elementary random variables $X_k = \sum_j x_{jk} I_{A_{jk}}$ is defined to be independence
of every finite number of events $A_{jk}$ as $k$ varies. The argument used
to prove the multiplication property yields:

If the expectations of a finite number of independent elementary random
variables exist and are finite, then the expectation of their product exists
and is the finite product of their expectations.

Also, Bienaymé's equality remains valid, provided its right-hand side
exists and is finite.


In the Bernoulli law of large numbers only simple random variables
figure and only finite additivity of the probability $P$ is used, so that
nothing is to be changed. However, now we can introduce probabilities
of denumerable combinations of events and use the supplementary
requirement that the additive property of $P$ remains valid for
denumerable sums. Therefore, in the present setup we can expect a more
precise interpretation of the “clustering of frequencies.” This is the
celebrated Borel strong law of large numbers derived below.

Let $X_1, X_2, \ldots$ be a sequence of elementary random variables. We
investigate the convergence to 0 of the sequence; the limits are taken
as $n \to \infty$. It will be more convenient to consider the contrary case:
$X_n$ does not converge to 0 or, equivalently, there exists at least one
integer $m$ such that to every integer $n$ there corresponds at least one
integer $\nu$ for which $|X_{n+\nu}| \ge \frac{1}{m}$. Since “at least one” corresponds to
“$\bigcup$” while “every” corresponds to “$\bigcap$,” we can write

$$[X_n \not\to 0] = \bigcup_{m} \bigcap_{n} \bigcup_{\nu} \Bigl[\,|X_{n+\nu}| \ge \frac{1}{m}\Bigr];$$

the right-hand side is an event. Thus, the condition $X_n \not\to 0$
determines the event $[X_n \not\to 0]$, the contrary condition $X_n \to 0$ determines
the complementary event $[X_n \to 0]$, and the probabilities of these two
events add up to 1.

We are interested in $X_n \to 0$ with probability 1 or, equivalently,
$X_n \not\to 0$ with probability 0, and require the following proposition.


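If, for every integer $m$, $\sum_{n=1}^{\infty} P\bigl[\,|X_n| \ge \frac{1}{m}\bigr] < \infty$, then $P[X_n \not\to 0] = 0$.

We set $A_{nm} = \bigcup_{\nu} \bigl[\,|X_{n+\nu}| \ge \frac{1}{m}\bigr]$ and $A_m = \bigcap_{n} A_{nm}$ and observe that,
by the covering rule and the hypothesis, for every $m$

$$PA_{nm} = P\bigcup_{\nu=1}^{\infty} \Bigl[\,|X_{n+\nu}| \ge \frac{1}{m}\Bigr]
\le \sum_{\nu=1}^{\infty} P\Bigl[\,|X_{n+\nu}| \ge \frac{1}{m}\Bigr] \to 0 \quad\text{as } n \to \infty.$$

Since whatever be $n'$,

$$PA_m = P\bigcap_{n=1}^{\infty} A_{nm} \le PA_{n'm},$$

it follows upon letting $n' \to \infty$ that $PA_m = 0$. Therefore, by the
covering rule

$$P[X_n \not\to 0] = P\bigcup_{m=1}^{\infty} A_m \le \sum_{m=1}^{\infty} PA_m = 0$$

and the proposition is proved.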

We can now pass to

Borel's strong law of large numbers (1909). In the Bernoulli
case

$$P\Bigl[\frac{S_n}{n} \to p\Bigr] = 1.$$

We recall that in the Bernoulli case

$$X_n = \frac{S_n}{n} = \frac{1}{n}\sum_{j=1}^{n} I_{A_j}$$

where the $A_j$ are independent events of common probability $p$ whatever
be $n$, and $EX_n = p$, $\sigma^2 X_n = pq/n$ (observe that only independence two
by two is used). Since for every $m$

$$\sum_{k=1}^{\infty} P\Bigl[\,|X_{k^2} - p| \ge \frac{1}{m}\Bigr] \le m^2 pq \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty,$$

it follows by the foregoing proposition that $X_{k^2} \to p$ with probability 1
as $k \to \infty$. But to every $n$ there corresponds an integer $k = k(n)$ with
$k^2 \le n < (k+1)^2$; hence $0 \le n - k^2 \le 2k$ and $n \to \infty$ implies $k \to \infty$.
Since

$$|X_n - X_{k^2}| = \Bigl|\Bigl(\frac{1}{n} - \frac{1}{k^2}\Bigr)\sum_{j=1}^{k^2} I_{A_j} + \frac{1}{n}\sum_{j=k^2+1}^{n} I_{A_j}\Bigr|
\le \frac{n-k^2}{nk^2}\,k^2 + \frac{n-k^2}{n} \le \frac{4k}{n} \le \frac{4}{k},$$

so that

$$|X_n - p| \le |X_n - X_{k^2}| + |X_{k^2} - p| \le \frac{4}{k} + |X_{k^2} - p|,$$

it follows that $X_n \to p$ with probability 1 as $n \to \infty$ and Borel's
result is proved.
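
Borel's result concerns a whole path of frequencies, not a single $n$. The following simulation is only an illustrative sketch (a seeded pseudo-random path, with hypothetical function names and parameters): it records $S_n/n$ along one simulated sequence of trials and measures the largest deviation from $p$ beyond a given index.

```python
import random

def running_frequencies(p, n_trials, seed=0):
    """Simulate Bernoulli trials; return the frequencies S_n / n for n = 1..n_trials."""
    rng = random.Random(seed)   # fixed seed so the path is reproducible
    successes = 0
    freqs = []
    for n in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        freqs.append(successes / n)
    return freqs

def tail_sup_deviation(freqs, p, start):
    """sup over n >= start of |S_n/n - p| along the simulated path."""
    return max(abs(f - p) for f in freqs[start - 1:])

if __name__ == "__main__":
    path = running_frequencies(p=0.3, n_trials=200000)
    for start in (100, 1000, 10000, 100000):
        print(start, tail_sup_deviation(path, 0.3, start))
```

A single simulated path proves nothing, of course; it merely shows the kind of behavior (tail deviations shrinking along the path) that the theorem asserts with probability 1.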

* Application. Let $X$ be an elementary random variable. We set
$F(x - 0) = F(x) = P[X < x]$, $F(x + 0) = P[X \le x]$ so that
$P[X = x] = F(x + 0) - F(x)$. The function $F$ so defined determines the
probability distribution of $X$, that is, the probabilities of all values of $X$;
it is called the distribution function of $X$. We organize repeated
independent trials where we observe the values of $X$; in other words, we
consider independent random variables $X_1, X_2, \ldots$ with the same
probability distribution as $X$.

If $k$ is the number of values observed in $n$ of those trials and which
are less than $x$ or, equivalently, if $k$ is the number of independent events
$[X_1 < x], [X_2 < x], \ldots, [X_n < x]$ (with common probability $p = F(x)$)
which occur, we set $F_n(x - 0) = F_n(x) = k/n$. Thus, $F_n(x)$ is a
random variable with

$$P\Bigl[F_n(x) = \frac{k}{n}\Bigr] = \frac{n!}{k!\,(n-k)!}\,\{F(x)\}^{k}\{1 - F(x)\}^{n-k}.$$

The function $F_n$ is called empirical distribution function of $X$ in $n$ trials.
According to Borel's strong law of large numbers, this frequency $F_n(x)$
of occurrences of the outcome $[X < x]$ converges to $F(x)$ with
probability 1. In other words, the observations permit us to find with
probability 1 every value $F(x)$ of the distribution function of $X$. In fact,
Borel's result yields more (Glivenko-Cantelli):

Central statistical theorem. If $F$ is the distribution function of
a random variable $X$ and $F_n$ is the empirical distribution function of $X$ in
$n$ independent and identical trials, then

$$P\Bigl[\sup_{-\infty < x < +\infty} |F_n(x) - F(x)| \to 0\Bigr] = 1.$$

In other words, with probability 1, $F_n(x) \to F(x)$ uniformly in $x$.

Let $x_{jk}$ be the smallest value $x$ such that

$$F(x) \le \frac{j}{k} \le F(x + 0).$$

Since the frequency of the event $[X < x_{jk}]$ is $F_n(x_{jk})$ and its probability
is $F(x_{jk})$, it follows by Borel's result that $PA'_{jk} = 1$ where
$A'_{jk} = [F_n(x_{jk}) \to F(x_{jk})]$. Similarly, $PA''_{jk} = 1$ where
$A''_{jk} = [F_n(x_{jk} + 0) \to F(x_{jk} + 0)]$. Let $A_{jk} = A'_{jk}A''_{jk}$ and let $\theta = \pm 0$, so that

$$A_k = \bigcap_{j=1}^{k} A_{jk} = \Bigl[\sup_{1 \le j \le k,\ \theta} |F_n(x_{jk} + \theta) - F(x_{jk} + \theta)| \to 0\Bigr].$$

By the covering rule and by what precedes

$$PA_k^{c} = P\bigcup_{j=1}^{k} A_{jk}^{c} \le \sum_{j=1}^{k} PA_{jk}^{c} = 0$$

and, hence, $PA_k = 1$. Upon setting $A = \bigcap_{k=1}^{\infty} A_k$, it follows similarly
that $PA = 1$.

On the other hand, for every $x$ between $x_{jk}$ and $x_{j+1,k}$

$$F(x_{jk} + 0) \le F(x) \le F(x_{j+1,k}), \qquad F_n(x_{jk} + 0) \le F_n(x) \le F_n(x_{j+1,k})$$

while for every $x_{jk}$

$$0 \le F(x_{j+1,k}) - F(x_{jk} + 0) \le \frac{1}{k}.$$

Therefore,

$$F_n(x) - F(x) \le F_n(x_{j+1,k}) - F(x_{jk} + 0) \le F_n(x_{j+1,k}) - F(x_{j+1,k}) + \frac{1}{k}$$

and

$$F_n(x) - F(x) \ge F_n(x_{jk} + 0) - F(x_{j+1,k}) \ge F_n(x_{jk} + 0) - F(x_{jk} + 0) - \frac{1}{k}.$$

It follows that, whatever be $x$ and $k$,

$$|F_n(x) - F(x)| \le \sup_{1 \le j \le k,\ \theta} |F_n(x_{jk} + \theta) - F(x_{jk} + \theta)| + \frac{1}{k}$$

or

$$\Delta_n = \sup_{-\infty < x < +\infty} |F_n(x) - F(x)| \le \sup_{1 \le j \le k,\ \theta} |F_n(x_{jk} + \theta) - F(x_{jk} + \theta)| + \frac{1}{k}.$$

Hence $P[\Delta_n \to 0] \ge PA = 1$, and the theorem is proved.
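
The uniform distance $\Delta_n$ can be computed exactly for an elementary random variable with finitely many values, since both $F_n$ and $F$ are step functions and the supremum is attained at the left and right limits at the values of $X$. The sketch below is illustrative only: the distribution, the seed, and the function name are assumptions, and `values` must be listed in increasing order.

```python
import random
from bisect import bisect_left, bisect_right

def sup_distance(values, probs, n_trials, seed=0):
    """sup_x |F_n(x) - F(x)| for one simulated sample of size n_trials.

    values: possible values of X in increasing order; probs: their probabilities.
    Uses the text's convention F(x) = P[X < x], F(x + 0) = P[X <= x].
    """
    rng = random.Random(seed)
    sample = sorted(rng.choices(values, weights=probs, k=n_trials))
    worst = 0.0
    cumulative = 0.0
    for v, p in zip(values, probs):
        f_left, f_right = cumulative, cumulative + p          # F(v), F(v + 0)
        fn_left = bisect_left(sample, v) / n_trials           # F_n(v)
        fn_right = bisect_right(sample, v) / n_trials         # F_n(v + 0)
        worst = max(worst, abs(fn_left - f_left), abs(fn_right - f_right))
        cumulative = f_right
    return worst

if __name__ == "__main__":
    # Hypothetical four-valued distribution.
    for n in (100, 1000, 10000, 100000):
        print(n, sup_distance([0, 1, 2, 3], [0.1, 0.2, 0.3, 0.4], n))
```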

*Remark. The foregoing proof and hence the theorem remain valid
when the random variable $X$ is not elementary.

7. Need for nonelementary random variables. The sophisticated
mathematician prefers to work with “closed” models, such that the
operations defined for the entities within the model yield only entities
within the model. While elementary random variables can be obtained
as limits of sequences of simple random variables, all limits of
sequences of simple and, more generally, of elementary random variables
are not necessarily elementary: families of elementary random
variables are not necessarily closed under passages to the limit. If this
closure is required, then the concept of a random variable has to be
extended so as to include “measurable functions.” This will be done in
the following parts. In fact, the need for further expansion of the model
in order to include random variables with a noncountable set of values
appeared quite early in the development of probability theory, once
more in connection with the Bernoulli case. This is the celebrated (as
the reader observes, all results obtained in or used for the Bernoulli
case are “celebrated”)


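De Moivre-Laplace theorem. In the Bernoulli case with $p > 0$,
$q = 1 - p > 0$, as $n \to \infty$,

de Moivre (1732):

$$P_n(x) = P[S_n = j] \sim \frac{1}{\sqrt{2\pi npq}}\,e^{-x^2/2}, \qquad x = \frac{j - np}{\sqrt{npq}},$$

uniformly on every finite interval $[a, b]$ of values of $x$;

Laplace (1801):

$$P\Bigl[a \le \frac{S_n - np}{\sqrt{npq}} \le b\Bigr] \to \frac{1}{\sqrt{2\pi}}\int_a^b e^{-x^2/2}\,dx.$$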

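The relation $a_n \sim b_n$ means that $a_n/b_n \to 1$. The integer $j$ varies
with $n$, so that $x = x(n)$ remains within a fixed finite interval $[a, b]$ and

$$j = np + x\sqrt{npq} \to \infty, \qquad k = n - j = nq - x\sqrt{npq} \to \infty.$$

We apply Stirling's formula

$$m! = \sqrt{2\pi m}\,m^{m}e^{-m}e^{\theta_m}, \qquad 0 < \theta_m < \frac{1}{12m},$$

to the binomial probabilities $P_n(x) = \dfrac{n!}{j!\,k!}\,p^{j}q^{k}$. Thus

$$P_n(x) = \frac{\sqrt{2\pi n}\,n^{n}e^{-n}}{\sqrt{2\pi j}\,j^{j}e^{-j}\,\sqrt{2\pi k}\,k^{k}e^{-k}}\,p^{j}q^{k}e^{\theta}, \qquad \theta = \theta_n - \theta_j - \theta_k,$$

where, uniformly on $[a, b]$,

$$|\theta| < \frac{1}{12}\Bigl(\frac{1}{n} + \frac{1}{j} + \frac{1}{k}\Bigr) \to 0, \qquad
\frac{n}{jk} = \frac{n}{(np + x\sqrt{npq})(nq - x\sqrt{npq})} \sim \frac{1}{npq}.$$

Therefore, uniformly on $[a, b]$,

$$P_n(x) \sim \frac{1}{\sqrt{2\pi npq}}\Bigl(\frac{np}{j}\Bigr)^{j}\Bigl(\frac{nq}{k}\Bigr)^{k}$$

and

$$\log\Bigl[\Bigl(\frac{np}{j}\Bigr)^{j}\Bigl(\frac{nq}{k}\Bigr)^{k}\Bigr]
= -(np + x\sqrt{npq})\log\Bigl(1 + x\sqrt{\frac{q}{np}}\Bigr) - (nq - x\sqrt{npq})\log\Bigl(1 - x\sqrt{\frac{p}{nq}}\Bigr)$$

$$= -(np + x\sqrt{npq})\Bigl[x\sqrt{\frac{q}{np}} - \frac{qx^2}{2np} + O(n^{-3/2})\Bigr]
- (nq - x\sqrt{npq})\Bigl[-x\sqrt{\frac{p}{nq}} - \frac{px^2}{2nq} + O(n^{-3/2})\Bigr]$$

$$= -\frac{x^2}{2} + O(n^{-1/2}).$$

The first assertion follows.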


Let $x_{nj}$ be those numbers of the form $\dfrac{j - np}{\sqrt{npq}}$ which belong to the
interval $[a, b]$; consecutive $x_{nj}$'s differ by $1/\sqrt{npq}$. On account of the
first assertion, uniformly in $j$,

$$P_n(x_{nj}) \sim \frac{1}{\sqrt{2\pi npq}}\,e^{-x_{nj}^2/2}$$

and

$$P\Bigl[a \le \frac{S_n - np}{\sqrt{npq}} \le b\Bigr] = \sum_j P_n(x_{nj}) \sim \frac{1}{\sqrt{2\pi}}\,\frac{1}{\sqrt{npq}}\sum_j e^{-x_{nj}^2/2}.$$

Since the last expression is a Riemann sum approximating the integral
$\dfrac{1}{\sqrt{2\pi}}\displaystyle\int_a^b e^{-x^2/2}\,dx$, the second assertion follows.
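
Both assertions of the theorem, and the Stirling bound used in the proof, lend themselves to a direct numerical check. The following is a hedged sketch rather than anything taken from the text: the function names and the sample values of $n$, $p$, $a$, $b$ are assumptions, and the binomial probabilities are evaluated through `math.lgamma` only to avoid overflow for large $n$.

```python
import math

def stirling_theta(m):
    """theta_m defined by m! = sqrt(2 pi m) m^m e^{-m} e^{theta_m}."""
    return math.lgamma(m + 1) - (0.5 * math.log(2.0 * math.pi * m)
                                 + m * math.log(m) - m)

def binomial_pmf(n, p, j):
    """P[S_n = j], computed through logarithms of factorials."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
               + j * math.log(p) + (n - j) * math.log(1.0 - p))
    return math.exp(log_pmf)

def local_ratio(n, p, j):
    """Ratio of P[S_n = j] to the de Moivre approximation at x = (j - np)/sqrt(npq)."""
    npq = n * p * (1.0 - p)
    x = (j - n * p) / math.sqrt(npq)
    return binomial_pmf(n, p, j) / (math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi * npq))

def interval_probability(n, p, a, b):
    """P[a <= (S_n - np)/sqrt(npq) <= b], summed exactly over the admissible j."""
    s = math.sqrt(n * p * (1.0 - p))
    lo = max(math.ceil(n * p + a * s), 0)
    hi = min(math.floor(n * p + b * s), n)
    return sum(binomial_pmf(n, p, j) for j in range(lo, hi + 1))

def normal_integral(a, b):
    """(1/sqrt(2 pi)) * integral of exp(-x^2/2) over [a, b], via the error function."""
    return 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))

if __name__ == "__main__":
    print(stirling_theta(10), 1.0 / 120.0)
    print(local_ratio(1000, 0.3, 314))
    print(interval_probability(10000, 0.3, -1.0, 1.0), normal_integral(-1.0, 1.0))
```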




III. DEPENDENCE AND CHAINS 


1. Conditional probabilities. Let $A$ be an event with $PA > 0$. The
ratio $PAB/PA$ is called the conditional probability of $B$ given $A$ or,
simply, probability of $B$ given $A$ and is denoted by $P_A B$, so that

$$PAB = PA\,P_A B.$$

By induction we obtain the multiplication rule:

$$P(AB \cdots KL) = PA\,P_A B \cdots P_{AB \cdots K} L.$$

Furthermore, if $\sum_j A_j = \Omega$ then, from

$$PB = P\Omega B = \sum_j PA_j B,$$

follows the total probability rule:

$$PB = \sum_j PA_j\,P_{A_j} B.$$

Bayes' theorem,

$$P_B A_k = \frac{PA_k\,P_{A_k} B}{\sum_j PA_j\,P_{A_j} B},$$

follows upon replacing $PB$ by the foregoing expression in the relation

$$PA_k B = PA_k\,P_{A_k} B = PB\,P_B A_k.$$

All events which figure as subscripts are supposed to be of positive
probability. However, if, say, $P_A B$ is given, then every given $PA$, whether
zero or not, determines correctly $PAB$ by $PAB = PA\,P_A B$, since $PA = 0$
implies $PAB = 0$.
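
As a worked instance of the total probability rule and Bayes' theorem, consider the following minimal sketch. The three-urn setup, the numbers, and the function names are hypothetical, chosen only for illustration; exact rational arithmetic keeps the results free of rounding.

```python
from fractions import Fraction

def total_probability(prior, likelihood):
    """PB = sum_j PA_j * P_{A_j}B over a partition A_j of the sure event."""
    return sum(prior[j] * likelihood[j] for j in prior)

def bayes(prior, likelihood):
    """P_B A_k = PA_k P_{A_k}B / sum_j PA_j P_{A_j}B, for every k."""
    pb = total_probability(prior, likelihood)
    return {k: prior[k] * likelihood[k] / pb for k in prior}

# Hypothetical example: an urn A_j is chosen with probability prior[j];
# likelihood[j] is the probability of drawing a white ball (event B) from it.
PRIOR = {"urn1": Fraction(1, 2), "urn2": Fraction(1, 3), "urn3": Fraction(1, 6)}
LIKELIHOOD = {"urn1": Fraction(1, 4), "urn2": Fraction(1, 2), "urn3": Fraction(1)}

if __name__ == "__main__":
    print(total_probability(PRIOR, LIKELIHOOD))   # 11/24
    print(bayes(PRIOR, LIKELIHOOD))               # posteriors 3/11, 4/11, 4/11
```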

The set of all probabilities of events given a fixed $A$ with $PA > 0$
defines a function $P_A$ on $\mathcal{A}$, to be called the conditional probability given
$A$ or, simply, the probability given $A$. It follows at once from the
definition that $P_A$ obeys axiom II': it is normed, nonnegative, and
$\sigma$-additive on $\mathcal{A}$. Therefore, the pair $(\mathcal{A}, P_A)$ is a probability field given $A$
for which all definitions and general properties of probability fields
remain valid. In particular, if $X = \sum_j x_j I_{A_j}$ is an elementary random
variable, the expectation of $X$ with respect to $P_A$ or the conditional
expectation of $X$ given $A$ or, simply, the expectation of $X$ given $A$ is defined
by

$$E_A X = \sum_j x_j P_A A_j = \frac{1}{PA}\sum_j x_j PAA_j;$$

clearly, if $EX$ exists and is finite, then $E_A X$ exists and is finite. In
terms of trials, the probability field given $A$ represents the original
trial with the occurrence of outcome $A$ added to the original conditions.

It is easily verified that the events $A_j$ of a countable set are
independent if, and only if, for every finite subset $j_1, \ldots, j_k$ of indices

$$P_{A_{j_1} \cdots A_{j_{k-1}}}(A_{j_k}) = PA_{j_k},$$

provided the “given” events have positive probability.

2. Asymptotically Bernoullian case. Let $A_n$, $n = 1, 2, \ldots$, be an
arbitrary sequence of events, and let $X_n = \dfrac{1}{n}\sum_{k=1}^{n} I_{A_k}$ be the random
frequency of occurrence of the $n$ first ones. We set

$$p_1(n) = \frac{1}{n}\sum_{k=1}^{n} PA_k, \qquad p_2(n) = \frac{2}{n(n-1)}\sum_{1 \le j < k \le n} PA_jA_k$$

so that $p_1(n)$ and $p_2(n)$ are bounded by 0 and 1. It follows, by
elementary computations, that

$$EX_n = p_1(n), \qquad \sigma^2 X_n = p_2(n) - p_1^2(n) + \frac{p_1(n) - p_2(n)}{n}.$$
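
The expressions for $EX_n$ and $\sigma^2 X_n$ in terms of $p_1(n)$ and $p_2(n)$ hold for arbitrary, possibly dependent, events, and this can be verified by exact enumeration on a small finite probability field. The sketch below is illustrative: the six-point space and the three overlapping events are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def frequency_moments(point_probs, events):
    """Return (p1, p2, variance of X_n) for X_n = (1/n) * sum of indicators.

    point_probs: dict mapping each sample point to its probability.
    events: list of sets of sample points (the events A_1, ..., A_n).
    """
    n = len(events)

    def prob(subset):
        return sum(point_probs[w] for w in subset)

    p1 = sum(prob(a) for a in events) / n
    p2 = sum(prob(a & b) for a, b in combinations(events, 2)) * Fraction(2, n * (n - 1))
    # Moments of X_n computed directly from its values at each sample point.
    mean = sum(pr * Fraction(sum(w in a for a in events), n)
               for w, pr in point_probs.items())
    second = sum(pr * Fraction(sum(w in a for a in events), n) ** 2
                 for w, pr in point_probs.items())
    return p1, p2, second - mean ** 2

# Hypothetical example: uniform probability on six points, three dependent events.
POINTS = {w: Fraction(1, 6) for w in range(6)}
EVENTS = [{0, 1, 2}, {1, 2, 3}, {2, 3, 4, 5}]

if __name__ == "__main__":
    p1, p2, var = frequency_moments(POINTS, EVENTS)
    n = len(EVENTS)
    print(var, p2 - p1**2 + (p1 - p2) / n)   # the two values coincide
```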

In the Bernoulli case

$$d_n = p_2(n) - p_1^2(n) = p^2 - p^2 = 0,$$

and we can consider the quantity $d_n$ as some sort of measure of
“deviation” from the Bernoulli case. To make this precise, let us first prove a

Kolmogorov inequality. If $X$ is an elementary random variable
bounded by 1 (in absolute value), then, for every $\epsilon > 0$,

$$P[\,|X| \ge \epsilon\,] \ge EX^2 - \epsilon^2.$$

We proceed as for the proof of Tchebichev's inequality: the inequality
follows from

$$EX^2 = E(X^2 I_{[|X| \ge \epsilon]}) + E(X^2 I_{[|X| < \epsilon]}) \le EI_{[|X| \ge \epsilon]} + \epsilon^2 = P[\,|X| \ge \epsilon\,] + \epsilon^2.$$

Extended Bernoulli law of large numbers. Bernoulli's result,
that for every $\epsilon > 0$

$$P[\,|X_n - EX_n| \ge \epsilon\,] \to 0,$$

remains valid for the sequence of events $A_n$, independent or not, if, and
only if,

$$d_n = p_2(n) - p_1^2(n) \to 0.$$

Since $|X_n - EX_n| \le 1$, we can apply Kolmogorov's inequality as well as
Tchebichev's, so that

$$\sigma^2 X_n - \epsilon^2 \le P[\,|X_n - EX_n| \ge \epsilon\,] \le \sigma^2 X_n/\epsilon^2.$$

Therefore, the asserted property holds if, and only if, $\sigma^2 X_n \to 0$. But

$$|\sigma^2 X_n - d_n| = \frac{|p_1(n) - p_2(n)|}{n} \le \frac{1}{n} \to 0,$$

and the extension follows.
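
The two-sided estimate used in this proof, Kolmogorov's lower bound together with Tchebichev's upper bound, can be checked on any variable bounded by 1. The following is a small sketch with a hypothetical five-valued variable; the function name and the data are assumptions made for illustration.

```python
from fractions import Fraction

def two_sided_bounds(values, probs, eps):
    """For X bounded by 1 in absolute value, return the triple
    (EX^2 - eps^2, P[|X| >= eps], EX^2 / eps^2)."""
    second_moment = sum(p * x * x for x, p in zip(values, probs))
    tail = sum(p for x, p in zip(values, probs) if abs(x) >= eps)
    return second_moment - eps * eps, tail, second_moment / (eps * eps)

# Hypothetical variable with values in [-1, 1].
VALUES = [Fraction(-1), Fraction(-1, 2), Fraction(0), Fraction(1, 2), Fraction(1)]
PROBS = [Fraction(1, 10), Fraction(2, 10), Fraction(4, 10), Fraction(2, 10), Fraction(1, 10)]

if __name__ == "__main__":
    lower, tail, upper = two_sided_bounds(VALUES, PROBS, Fraction(1, 2))
    print(lower, tail, upper)   # 1/20, 3/5, 6/5
```

To mirror the proof exactly, one would apply the same function to the centered variable $X_n - EX_n$, whose second moment is $\sigma^2 X_n$.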

If $d_n \to 0$ at least as fast as $1/n$, then (asymptotically) we are even
“closer” to the Bernoulli case. In fact,

Extended Borel strong law of large numbers. If $d_n = O(1/n)$,
then Borel's result remains valid:

$$P[X_n - EX_n \to 0] = 1.$$

The hypothesis means that there exists a fixed finite number $c$ such that
$|nd_n| \le c$. Upon referring to the proof of Borel's result, we observe
that it suffices to show that $\sum_{k=1}^{\infty} \sigma^2 X_{k^2} < \infty$. Since

$$n\sigma^2 X_n \le |nd_n| + |p_1(n) - p_2(n)| \le c + 1,$$

it follows by setting $n = k^2$ that

$$\sum_{k=1}^{\infty} \sigma^2 X_{k^2} \le (c + 1)\sum_{k=1}^{\infty} \frac{1}{k^2} < \infty,$$

and the extension follows.

It is easily shown that both extensions apply to the events $A_n$ which
are independent but otherwise arbitrary.

3. Recurrence. The decomposition

$$\sigma^2 X_n = p_2(n) - p_1^2(n) + \frac{p_1(n) - p_2(n)}{n}$$

yields at once a proposition which leads very simply to the celebrated
Poincaré's recurrence theorem and its known refinements. Since
$\sigma^2 X_n \ge 0$ and $p_1(n)$, $p_2(n)$ are bounded by 0 and 1, it follows that, for
any fixed $\epsilon > 0$, if $n \ge 1/\epsilon$, then

$$p_2(n) = p_1^2(n) - \frac{p_1(n) - p_2(n)}{n} + \sigma^2 X_n \ge p_1^2(n) - \frac{1}{n} \ge p_1^2(n) - \epsilon.$$

But $p_2(n)$ is the arithmetic mean of $PA_jA_k$ for $1 \le j < k \le n$.
Therefore,

Whatever be the events $A_n$, if $n \ge 1/\epsilon$, then there exist at least two events
$A_j$, $A_k$, $1 \le j < k \le n$, such that $PA_jA_k \ge p_1^2(n) - \epsilon$.
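
Because the proposition holds for arbitrary events, it can be tested on randomly generated ones. The sketch below is illustrative only (uniform probability on a finite set of points, seeded random subsets, hypothetical names): it computes $p_1(n)$ and the largest pairwise intersection probability, which by the proposition must be at least $p_1^2(n) - 1/n$.

```python
import random
from fractions import Fraction
from itertools import combinations

def random_events(n_events, n_points, density=0.4, seed=1):
    """Hypothetical test data: n_events random subsets of {0, ..., n_points - 1}."""
    rng = random.Random(seed)
    return [{w for w in range(n_points) if rng.random() < density}
            for _ in range(n_events)]

def best_pair(events, n_points):
    """Return (p1, max over j < k of PA_jA_k) under the uniform probability."""
    n = len(events)
    p1 = Fraction(sum(len(a) for a in events), n * n_points)
    best = max(Fraction(len(a & b), n_points) for a, b in combinations(events, 2))
    return p1, best

if __name__ == "__main__":
    events = random_events(20, 30)
    p1, best = best_pair(events, 30)
    eps = Fraction(1, len(events))     # smallest epsilon allowed by n >= 1/eps
    print(float(best), float(p1**2 - eps))
```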

In particular, if $PA_n \ge p > 0$ whatever be $n$, then every subsequence
of these events contains at least two events $A_j$, $A_k$ such that
$PA_jA_k \ge p^2 - \epsilon$; if this inequality holds, we say that $A_j$ “$\epsilon$-intersects” $A_k$. In
fact, there exists then a subsequence whose first term $\epsilon$-intersects every
other term. For, if there is no such subsequence, then there exist
integers $m_n$ such that no event $A_n$ $\epsilon$-intersects events $A_{n'}$ with $n' \ge n + m_n$,
no two events of the subsequence $A_{n_1}, A_{n_2}, A_{n_3}, \ldots$ with $n_1 = 1$,
$n_2 = n_1 + m_{n_1}$, $n_3 = n_2 + m_{n_2}$, $\ldots$, $\epsilon$-intersect, and this contradicts the
particular case of the foregoing proposition. Thus, let $A_{11}, A_{21}, A_{31}, \ldots$
be a subsequence such that the first term $\epsilon$-intersects every other term.
Let $A_{12}, A_{22}, A_{32}, \ldots$ be a subsequence of $A_{21}, A_{31}, \ldots$ with same
property, and so on indefinitely. The sequence $A_{11}, A_{12}, A_{13}, \ldots$ is such
that every one of its terms $\epsilon$-intersects every other term. Hence

Recurrence theorem. If $PA_n \ge p > 0$ whatever be $n$, then for every
$\epsilon > 0$ there exists a subsequence of events $A_n$ such that $PA_jA_k \ge p^2 - \epsilon$
whatever be the terms $A_j$, $A_k$ of this subsequence.

We observe that, if $PA_n = p$, then $PA_jA_k \ge p^2 - \epsilon$ while, if the $A_n$
are two by two independent, then $PA_jA_k = p^2$. Thus, however small
be $\epsilon > 0$, for every sequence $A_n$ of events, independent or not, there
exists a subsequence which behaves as if its terms were two by two
semi-independent up to $\epsilon$ (“semi” only since we do not have necessarily
$PA_jA_k \le p^2 + \epsilon$).

A phenomenological interpretation of the foregoing theorem is as
follows. Consider integer values of time and an incompressible fluid
in motion filling a container of unit volume. Any portion of the fluid
which at time 0 occupies a position $A$ of volume $PA = p > 0$ occupies
at time $m$ a position $A_m$ of same volume $PA_m = p$. The theorem says
that, for every $\epsilon > 0$, the portion occupies in its motion an infinity of
positions such that the volume of the intersection of any two of these
positions is $\ge p^2 - \epsilon$. In particular, if the motion is “second order
stationary,” that is, $PA_jA_{j+k} = PAA_k$, then it intersects infinitely often
its initial position (this is Poincaré's recurrence theorem; he assumes
“stationarity”), and the intersections may be selected to be of volume
$\ge p^2 - \epsilon$ (this is Khintchine's refinement).

4. Chain dependence. There is a type of dependence, studied by
Markov and frequently called Markov dependence, which is of
considerable phenomenological interest. It represents the chance (random,
stochastic) analogue of nonhereditary systems, mechanical, optical,
whose known properties constitute the bulk of the present knowledge
of laws of nature.

A system is subject to laws which govern its evolution. For example,
a particle in a given field of forces is subject to Newton's laws of
motion, and its positions and velocities at times $1, 2, \ldots$ describe the
“states” (events) that we observe; crudely described, a very small
particle in a given liquid is subject to Brownian laws of motion, and its
positions (or positions and velocities) at times $t = 1, 2, \ldots$ are the
“states” (events) that we observe. While Newton's laws of motion are
deterministic in the sense that, given the present state of the particle,
the future states are uniquely determined (are sure outcomes), Brownian
laws of motion are stochastic in the sense that only the probabilities of
future states are determined. Yet both systems are “nonhereditary” in
the sense that the future (described by the sure outcomes or
probabilities of outcomes, respectively) is determined by the last observed state
only: the “present.” It is sometimes said that nonhereditary systems
obey the “Huygens principle.” The mathematical concept of
nonheredity in a stochastic context is that of Markov or chain dependence,
and appears as a “natural” generalization of that of independence.

Events $A_j$, where $j$ runs over an ordered countable set, are said to be
chained if the probability of every $A_j$ given any finite set of the
preceding ones depends only upon the last given one; in symbols, for every
finite subset of indices $j_1 < j_2 < \cdots < j_k$, we have

$$P_{A_{j_1}A_{j_2} \cdots A_{j_{k-1}}}(A_{j_k}) = P_{A_{j_{k-1}}}(A_{j_k}).$$

Classes $\mathcal{C}_j = \{A_{j1}, A_{j2}, \ldots\}$ of events are said to be chained if events
$A_{jk}$ selected arbitrarily, one in each $\mathcal{C}_j$, are chained.

An elementary chain is a sequence of chained elementary partitions
$\sum_k A_{nk} = \Omega$, $n = 1, 2, \ldots$; in particular, if $X_n = \sum_k x_{nk} I_{A_{nk}}$ with
distinct $x_{n1}, x_{n2}, \ldots$ are elementary random variables, then the $X_n$
are said to be chained, or to form a chain, when the corresponding
partitions are chained.

It will be convenient to use a phenomenological terminology. Events
of the $n$th partition will be called states at time $n$, or at the $n$th step, of
the system described by the chain. The totality of all states of the
system is countable; we shall denote them by the letters $j, k, h, \ldots$,
and summations over, say, states $k$ will be over the set of all states,
unless otherwise stated.

The evolution of the system is described by the probabilities of its
states given the last known one. The probabilities $P^{m,n}_{jk}$ of passage
from a state $j$ at time $m$ to a state $k$ at time $m + n$ (in $n$ steps) form a
matrix $P^{m,n}$. Since “probability given $j$” is a probability, and the
probability given $j$ at time $m$ to pass to some state in $n$ steps is one,
we have

$$P^{m,n}_{jk} \ge 0, \qquad \sum_k P^{m,n}_{jk} = 1.$$

Furthermore, by the definition of chain dependence, the probability
given $j$ at time $m$ to pass to state $k$ in $n + n'$ steps equals the probability
given $j$ at time $m$ to pass to some state in $n$ steps and then to pass to
$k$ in $n'$ steps; hence we have

$$P^{m,n+n'}_{jk} = \sum_h P^{m,n}_{jh}\,P^{m+n,n'}_{hk}$$

or, in matrix notation,

$$P^{m,n+n'} = P^{m,n}P^{m+n,n'}.$$

An elementary chain is said to be constant if $P^{m,n}_{jk}$ is independent of $m$
whatever be $j$, $k$, and $n$. Then we denote this probability by $P^{n}_{jk}$, and
call it transition probability from $j$ to $k$ in $n$ steps. The corresponding
matrix $P^n$ is called transition matrix in $n$ steps; if $n = 1$ we drop it.
The foregoing relations become the basic constant chain relations:

$$P^{n}_{jk} \ge 0, \qquad \sum_k P^{n}_{jk} = 1, \qquad P^{n+n'}_{jk} = \sum_h P^{n}_{jh}\,P^{n'}_{hk}.$$

The last one can also be written as a matrix product $P^{n+n'} = P^{n}P^{n'}$.
Hence $P^n$ is the $n$th power of the transition matrix $P = P^1$, so that $P$
determines all transition probabilities. In fact, for an elementary chain
to be constant it suffices that the matrix $P^{m,1}$ be independent of $m$:
$P^{m,1} = P$, since then

$$P^{m,2} = P^{m,1}P^{m+1,1} = P^2, \qquad P^{m,3} = P^{m,2}P^{m+2,1} = P^3, \quad \ldots.$$

We observe that in $P^{n}_{jk}$ and in every symbol to be introduced below,
superscripts are not power indices, unless so stated.
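
For a constant chain with finitely many states, the basic relations amount to ordinary matrix multiplication. The following minimal sketch uses a hypothetical three-state transition matrix and exact fractions; the helper names are assumptions. It checks that the rows of $P^n$ sum to 1 and that $P^{n+n'} = P^{n}P^{n'}$.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Matrix product: (ab)_{jk} = sum_h a_{jh} b_{hk}."""
    size = len(a)
    return [[sum(a[j][h] * b[h][k] for h in range(size)) for k in range(size)]
            for j in range(size)]

def mat_pow(p, n):
    """Transition matrix in n steps, starting from the unit matrix P^0."""
    size = len(p)
    result = [[Fraction(int(j == k)) for k in range(size)] for j in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

# Hypothetical transition matrix of a constant chain with three states.
P = [[Fraction(1, 2), Fraction(1, 2), Fraction(0)],
     [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
     [Fraction(0), Fraction(1, 2), Fraction(1, 2)]]

if __name__ == "__main__":
    print(mat_pow(P, 5) == mat_mul(mat_pow(P, 2), mat_pow(P, 3)))   # True
    print([sum(row) for row in mat_pow(P, 5)])                       # each row sums to 1
```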

We investigate the evolution of a system subject to constant chain laws
described by a transition matrix $P$. In particular, we want to find its
asymptotic behavior according to the state from which it starts. In
phenomenological terms the system is a nonhereditary one subject to
constant laws (independent of the time) and we ask what happens to
the system in the long run. The “direct” method we use, which requires
no special tools and has a definite appeal to the intuition, has
been developed by Kolmogorov (1936) and by Döblin (1936, 1937)
after Hadamard (1928) introduced it. But the concept of chain and
the basic pioneering work are due to Markov (1907).

*5. Types of states and asymptotic behavior. According to the total
probability rule and the definition of chain dependence, the probability
$Q^{n}_{jk}$ of passage from $j$ to $k$ in exactly $n$ steps, that is, without passing
through $k$ before the $n$th step, is given by

$$Q^{n}_{jk} = \sum_{h_1 \ne k,\ h_2 \ne k,\ \ldots,\ h_{n-1} \ne k} P_{jh_1}P_{h_1h_2} \cdots P_{h_{n-1}k}.$$

The central relation in our investigation is

(1)   $P^{n}_{jk} = \sum_{m=1}^{n} Q^{m}_{jk}\,P^{n-m}_{kk}, \qquad n = 1, 2, \ldots;$

the expressions $P^{0}_{kk} = 1$ (obtained for $m = n$) are the diagonal elements
of the unit matrix $P^0$.

The proof is immediate upon applying the total probability rule.
The system passes from $j$ to $k$ in $n$ steps if, and only if, it passes from $j$
to $k$ for the first time in exactly $m$ steps, $m = 1, 2, \ldots, n$, and then
passes from $k$ to $k$ in the remaining $n - m$ steps. These “paths” are
disjoint events, and their probabilities are given by $Q^{m}_{jk}P^{n-m}_{kk}$.
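
The central relation can be verified mechanically on a finite chain. In the sketch below (a hypothetical three-state chain; the function names are assumptions), the first-passage probabilities are computed by the recursion $Q^{1}_{jk} = P_{jk}$, $Q^{n}_{jk} = \sum_{h \ne k} P_{jh}Q^{n-1}_{hk}$, which is the defining sum over paths avoiding $k$ written step by step, and are then compared with the $n$-step transition probabilities.

```python
from fractions import Fraction

def first_passage(p, k, n_max):
    """q[n][j] = Q^n_{jk}: reach k from j for the first time at step n (n = 1..n_max)."""
    size = len(p)
    q = [None, [p[j][k] for j in range(size)]]
    for n in range(2, n_max + 1):
        prev = q[n - 1]
        q.append([sum(p[j][h] * prev[h] for h in range(size) if h != k)
                  for j in range(size)])
    return q

def n_step_column(p, k, n_max):
    """col[n][j] = P^n_{jk} for n = 0..n_max (column k of the matrix powers)."""
    size = len(p)
    col = [[Fraction(int(j == k)) for j in range(size)]]
    for _ in range(n_max):
        prev = col[-1]
        col.append([sum(p[j][h] * prev[h] for h in range(size)) for j in range(size)])
    return col

def central_relation_holds(p, k, n_max):
    """Check P^n_{jk} = sum_{m=1}^{n} Q^m_{jk} P^{n-m}_{kk} for all j and n <= n_max."""
    q = first_passage(p, k, n_max)
    col = n_step_column(p, k, n_max)
    return all(col[n][j] == sum(q[m][j] * col[n - m][k] for m in range(1, n + 1))
               for n in range(1, n_max + 1) for j in range(len(p)))

# Hypothetical three-state transition matrix.
P = [[Fraction(1, 2), Fraction(1, 2), Fraction(0)],
     [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
     [Fraction(0), Fraction(1, 2), Fraction(1, 2)]]

if __name__ == "__main__":
    print(all(central_relation_holds(P, k, 8) for k in range(3)))   # True
```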
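Summing over $n = 1, 2, \ldots, N$, the central relation yields

$$\sum_{n=1}^{N} P^{n}_{jk} = \sum_{n=1}^{N}\sum_{m=1}^{n} Q^{m}_{jk}\,P^{n-m}_{kk} = \sum_{m=1}^{N}\Bigl(Q^{m}_{jk}\sum_{n=0}^{N-m} P^{n}_{kk}\Bigr)$$

and, therefore,

$$\Bigl(1 + \sum_{n=1}^{N} P^{n}_{kk}\Bigr)\sum_{m=1}^{N} Q^{m}_{jk} \ \ge\ \sum_{n=1}^{N} P^{n}_{jk} \ \ge\ \Bigl(1 + \sum_{n=1}^{N-N'} P^{n}_{kk}\Bigr)\sum_{m=1}^{N'} Q^{m}_{jk}, \qquad N' < N.$$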

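It follows, upon dividing by $1 + \sum_{n=1}^{N} P^{n}_{kk}$ and letting first $N \to \infty$ and
then $N' \to \infty$, that

(2)   $q_{jk} = \sum_{m=1}^{\infty} Q^{m}_{jk} = \lim_{N \to \infty} \dfrac{\sum_{n=1}^{N} P^{n}_{jk}}{1 + \sum_{n=1}^{N} P^{n}_{kk}}$

and, in particular,

(3)   $1 - q_{jj} = \lim_{N \to \infty} \dfrac{1}{1 + \sum_{n=1}^{N} P^{n}_{jj}}.$

The sum $q_{jk}$ is the probability, starting at $j$, of passing through $k$ at least once; for
$k = j$ it is the probability of returning to $j$ at least once. More generally,
the probability $q^{n}_{jk}$, starting at $j$, of passing through $k$ at least $n$ times is
given by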

$$q^{n}_{jk} = \Bigl(\sum_{m=1}^{\infty} Q^{m}_{jk}\Bigr)q^{n-1}_{kk} = q_{jk}\,q^{n-1}_{kk}.$$

In particular, the probability $q^{n}_{jj}$ of returning to $j$ at least $n$ times is given
by

$$q^{n}_{jj} = q_{jj}\,q^{n-1}_{jj} = (q_{jj})^2 q^{n-2}_{jj} = \cdots = (q_{jj})^{n}.$$

Its limit,

(4)   $r_{jj} = \lim_{n \to \infty} (q_{jj})^{n} = 0$ or $1$, according as $q_{jj} < 1$ or $q_{jj} = 1$,

is the probability of returning to $j$ infinitely often. It follows that the
probability, starting at $j$, of passing through $k$ infinitely often is

$$r_{jk} = \lim_{n \to \infty} q^{n}_{jk} = q_{jk}\lim_{n \to \infty} q^{n-1}_{kk} = q_{jk}\,r_{kk},$$

so that

(5)   $r_{jk} = 0$ or $q_{jk}$, according as $q_{kk} < 1$ or $q_{kk} = 1$.

Upon singling out the states $j$ such that $q_{jj} = 0$ (noreturn) or $q_{jj} = 1$
(return with probability 1), we are led to two dichotomies of states:

$j$ is a return state or a noreturn state according as $q_{jj} > 0$ or $q_{jj} = 0$; $j$ is
a recurrent state or a nonrecurrent state according as $q_{jj} = 1$ or $q_{jj} < 1$
or, on account of (4), according as $r_{jj} = 1$ or $r_{jj} = 0$.

Clearly, noreturn states are boundary cases of nonrecurrent states and
recurrent states are boundary cases of return states. In terms of
transition probabilities, we have the following criteria.

Return criterion. A state $j$ is a return or a noreturn state according
as $P^{n}_{jj} > 0$ for at least one $n$ or $P^{n}_{jj} = 0$ for all $n$.

This follows at once from the fact that

$$\sup_{n} P^{n}_{jk} \le q_{jk} \le \sum_{n=1}^{\infty} P^{n}_{jk}.$$

Recurrence criterion. A state $j$ is a recurrent or a nonrecurrent
state according as the series $\sum_{n=1}^{\infty} P^{n}_{jj}$ is divergent or convergent.

This follows from (3).
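
The recurrence criterion rests on the limit relation obtained above by dividing by $1 + \sum_{n} P^{n}_{kk}$, which for $k = j$ reads $q_{jj} = \lim_{N} \sum_{n=1}^{N} P^{n}_{jj} \big/ \bigl(1 + \sum_{n=1}^{N} P^{n}_{jj}\bigr)$. The sketch below evaluates the ratio at a finite $N$ for a hypothetical two-state chain in which state 0 is nonrecurrent (it is left for the absorbing state 1 with probability 1/2 at each step) and state 1 is recurrent; the function name and the chain are illustrative assumptions.

```python
def return_ratio(p, j, n_max):
    """sum_{n<=N} P^n_jj / (1 + sum_{n<=N} P^n_jj) with N = n_max.

    As N grows this ratio approaches q_jj, the probability of at least one return.
    """
    size = len(p)
    row = [1.0 if h == j else 0.0 for h in range(size)]   # row j of P^0
    total = 0.0
    for _ in range(n_max):
        # Row j of P^n is row j of P^(n-1) multiplied by P.
        row = [sum(row[h] * p[h][k] for h in range(size)) for k in range(size)]
        total += row[j]
    return total / (1.0 + total)

# Hypothetical chain: state 1 is absorbing, state 0 leaks into it.
P = [[0.5, 0.5],
     [0.0, 1.0]]

if __name__ == "__main__":
    print(return_ratio(P, 0, 200))    # close to 1/2: series converges, nonrecurrent
    print(return_ratio(P, 1, 1000))   # close to 1: series diverges, recurrent
```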

Less obvious types of states are described in terms of “mean
frequency of returns,” as follows:

Let $\nu_{jk}$ be the passage time from $j$ to $k$, taking values $m = 1, 2, \ldots$
with probabilities $Q^{m}_{jk}$. If $q_{jk} = 1$, then $\nu_{jk}$ are elementary random
variables. If $q_{jk} < 1$, then, to avoid exceptions, we say that $\nu_{jk} = \infty$ with
probability $1 - q_{jk}$. The symbol $\infty$ is subject to the rules

$$\frac{1}{\infty} = 0, \qquad \infty + c = \infty, \qquad \infty \times c = \infty \text{ or } 0 \text{ according as } c > 0 \text{ or } c = 0.$$

We define the expected passage time $\tau_{jk}$ from $j$ to $k$ by

$$\tau_{jk} = \sum_{m=1}^{\infty} mQ^{m}_{jk} + \infty(1 - q_{jk});$$

we call $\tau_{jj}$ the expected return time to $j$ and the mean frequency of returns
to $j$ is $\dfrac{1}{\tau_{jj}}$.

We can now define the following dichotomy of states. A state $j$ is
null or positive according as $\dfrac{1}{\tau_{jj}} = 0$ or $\dfrac{1}{\tau_{jj}} > 0$. Clearly, a noreturn
and, more generally, a nonrecurrent state is null while a positive state
is recurrent.

We shall now establish a criterion for this new dichotomy of states
in terms of transition probabilities. To make it precise, we have to
introduce the concept of period of a state.

Let $j$ be a return state; then let $d_j$ be the period of the $Q^{n}_{jj}$, that is,
the greatest integer such that a return to $j$ can occur with positive
probability only after multiples of $d_j$ steps: $Q^{n}_{jj} = 0$ for all $n \not\equiv 0$ (modulo
$d_j$), and $Q^{nd_j}_{jj} > 0$ for some $n$. Let $d'_j$ be the period of the $P^{n}_{jj}$ defined
similarly. We prove that $d_j = d'_j$ and call it the period (of return) of $j$.

The proof is immediate. If $Q^{n}_{jj} > 0$, then $P^{n}_{jj} \ge Q^{n}_{jj} > 0$ so that
$d'_j \le d_j$. Thus, if $d_j = 1$, then $d'_j = 1$. If $d_j > 1$ and $r = 1, \ldots, d_j - 1$,
then the central relation yields

$$P^{r}_{jj} = 0, \qquad P^{d_j + r}_{jj} = Q^{d_j}_{jj}P^{r}_{jj} = 0,$$

$$P^{2d_j + r}_{jj} = Q^{d_j}_{jj}P^{d_j + r}_{jj} + Q^{2d_j}_{jj}P^{r}_{jj} = 0, \quad \text{etc.},$$

so that $d'_j \ge d_j$ and, hence, $d_j = d'_j$.

If $j$ is a noreturn state, then we say that its period is infinite.
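
For a finite chain the period of a state can be estimated as the greatest common divisor of the step counts $n$ at which $P^{n}_{jj} > 0$. The following is a rough sketch under a stated assumption: only the steps up to a cutoff `n_max` are inspected, so the result is reliable only when the cutoff is large enough for the chain at hand. The return value 0 is a convention of this sketch meaning that no return was observed, which corresponds to the infinite period of a noreturn state. The example matrices are hypothetical.

```python
from math import gcd

def period(p, j, n_max=50):
    """gcd of all n <= n_max with P^n_jj > 0; 0 if no such n was found."""
    size = len(p)
    row = [1.0 if h == j else 0.0 for h in range(size)]   # row j of P^0
    d = 0
    for n in range(1, n_max + 1):
        row = [sum(row[h] * p[h][k] for h in range(size)) for k in range(size)]
        if row[j] > 0:
            d = gcd(d, n)
    return d

# Hypothetical examples.
CYCLE = [[0.0, 0.5, 0.0, 0.5],      # symmetric walk on a 4-cycle: returns only
         [0.5, 0.0, 0.5, 0.0],      # after an even number of steps
         [0.0, 0.5, 0.0, 0.5],
         [0.5, 0.0, 0.5, 0.0]]
LAZY = [[0.5, 0.5],
        [0.5, 0.5]]
NO_RETURN = [[0.0, 1.0],
             [0.0, 1.0]]

if __name__ == "__main__":
    print(period(CYCLE, 0), period(LAZY, 0), period(NO_RETURN, 0))   # 2 1 0
```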

Positivity criterion. A state $j$ is null or positive according as
$\limsup_{n \to \infty} P^{n}_{jj} = 0$ or $> 0$.

More precisely, if $j$ is a null state, then $P^{n}_{jj} \to 0$, and if $j$ is a positive
state, then $P^{nd_j}_{jj} \to \dfrac{d_j}{\tau_{jj}} > 0$, while $P^{n}_{jj} = 0$ for all $n \not\equiv 0$ (modulo $d_j$).

Since the proof is involved, we give it in several steps.

1° If $j$ is nonrecurrent, then it is null and, by the recurrence
criterion, the series $\sum_{n=1}^{\infty} P^{n}_{jj}$ converges so that $P^{n}_{jj} \to 0$.

If $j$ is recurrent, then, by definition of its period, $P^{n}_{jj} = 0$ for all
$n \not\equiv 0$ (modulo $d_j$). Therefore, it suffices to prove that, if $j$ is
recurrent, then $P^{nd_j}_{jj} \to \dfrac{d_j}{\tau_{jj}}$; for, if $j$ is null, then $\dfrac{1}{\tau_{jj}} = 0$ implies $\dfrac{d_j}{\tau_{jj}} = 0$,
and if $j$ is positive, then $\dfrac{d_j}{\tau_{jj}} > 0$.

Assume, for the moment, that, if the period $d_j$ of the positive
recurrent state $j$ is 1, then $P^{n}_{jj} \to \dfrac{1}{\tau_{jj}}$. In the general case, take $d_j$ for the
unit step and set $P' = P^{d_j}$ so that $P'^{\,n}_{jj} = P^{nd_j}_{jj}$; hence $Q'^{\,n}_{jj} = Q^{nd_j}_{jj}$. Then,
since $\tau'_{jj} = \sum_{n=1}^{\infty} nQ'^{\,n}_{jj} = \dfrac{\tau_{jj}}{d_j}$, the assertion follows by

$$P^{nd_j}_{jj} = P'^{\,n}_{jj} \to \frac{1}{\tau'_{jj}} = \frac{d_j}{\tau_{jj}}.$$

Thus, it suffices to prove that, if $j$ is recurrent with period $d_j = 1$, then

$$P^{n}_{jj} \to \frac{1}{\tau_{jj}}.$$

2° Let $j$ be recurrent with $d_j = 1$. To simplify the writing, we drop
the subscripts $j$ and, to avoid confusion with matrices, we write
superscripts as subscripts. We follow now Erdős, Feller, and Pollard.

Let $\alpha = \limsup_{n \to \infty} P_n$ so that there is a subsequence $n'$ of integers such
that $P_{n'} \to \alpha$ as $n' \to \infty$. Since $q = \sum_{m=1}^{\infty} Q_m = 1$, it follows that, given
$\epsilon > 0$, there exists $n_\epsilon$ such that, for $n \ge n_\epsilon$, $\sum_{m=n+1}^{\infty} Q_m < \epsilon$. Therefore,
for $n' \ge n \ge n_\epsilon$ and every $p \le n$ with $Q_p > 0$, the central relation
yields

$$P_{n'} \le Q_pP_{n'-p} + \sum_{m \le n,\ m \ne p} Q_mP_{n'-m} + \epsilon.$$

Since for $n'$ sufficiently large, $P_{n'} > \alpha - \epsilon$ and $P_{n'-m} < \alpha + \epsilon$ for
$m \le n$, it follows that

$$\alpha - \epsilon \le Q_pP_{n'-p} + (1 - Q_p)(\alpha + \epsilon) + \epsilon;$$

hence

$$\alpha + \epsilon - \frac{3\epsilon}{Q_p} \le P_{n'-p} < \alpha + \epsilon.$$

Therefore, letting $n' \to \infty$ and then $\epsilon \to 0$, we obtain $P_{n'-p} \to \alpha$
and, repeating the argument, we have, for every fixed integer $m$,

$$P_{n'-mp} \to \alpha \quad \text{as } n' \to \infty.$$

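3° Let us assume, for the moment, that $Q_1 > 0$ so that $P_{n'-m} \to \alpha$
for every fixed $m$. We introduce the expected return time $\tau$ and use
the fact that $j$ is recurrent, so that, setting $q_n = \sum_{m=n+1}^{\infty} Q_m$, we have
$q_0 = 1$. The expected return time $\tau$ can be written

$$\tau = \sum_{m=1}^{\infty} mQ_m = \sum_{m=1}^{\infty} m(q_{m-1} - q_m) = \sum_{m=0}^{\infty} q_m,$$

and the central relation can be written

$$P_n = \sum_{m=1}^{n} (q_{m-1} - q_m)P_{n-m} \quad \text{or} \quad \sum_{m=0}^{n} q_mP_{n-m} = \sum_{m=0}^{n-1} q_mP_{n-1-m},$$

so that

$$\sum_{m=0}^{n} q_mP_{n-m} = q_0P_0 = 1$$

for every $n$. Since all terms are nonnegative, $\sum_{m=0}^{n} q_mP_{n'-m} \le 1$ for
$n \le n'$ and, letting $n' \to \infty$ and then $n \to \infty$, we obtain $\alpha\tau \le 1$, that is,
$\alpha \le 1/\tau$. If $\tau = \infty$, then $\alpha = 0$; hence $P_n \to 1/\tau = 0$. Thus let $\tau < \infty$.
The same argument for $\beta = \liminf_{n \to \infty} P_n$ shows that, for a subsequence $n''$ such
that $P_{n''} \to \beta$ as $n'' \to \infty$, we have $P_{n''-m} \to \beta$ for every fixed $m$
and, from

$$\sum_{m=0}^{n} q_mP_{n''-m} + \epsilon \ge 1 \quad \text{for } n \text{ sufficiently large},$$

it follows as above that $\beta \ge \dfrac{1}{\tau}$. Therefore, $P_n \to \dfrac{1}{\tau}$, and the assertion
is proved under the assumption that $Q_1 > 0$.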

4° To get rid of the last assumption, we appeal to elementary number theory. Consider the set of all those $p$ for which $Q_p > 0$. It contains a finite subset $\{p_i\}$ whose greatest common divisor is the period $d$ ($= 1$). As above, if $P_{n'} \to \alpha$, then $P_{n'-m_i p_i} \to \alpha$ for every fixed $m_i$ and $p_i$, and it follows that $P_{n'-m} \to \alpha$ for every fixed linear combination $m = \sum_i m_i p_i$. But every multiple of the period $md = m \ge \prod_i p_i$ can be written in this form, so that, starting with $n'$ sufficiently large, $P_{n'-m} \to \alpha$ for every fixed $m$, and the assertion follows as above. This concludes the proof.
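The limit just proved can be checked numerically. The following is a minimal sketch, not part of the original text: the first-return laws `Q` and `Q2` are hypothetical, and the return probabilities are generated from the central relation $P_n = \sum_{m=1}^{n} Q_m P_{n-m}$ with $P_0 = 1$.

```python
def return_probabilities(Q, n_max):
    """P_0 = 1 and P_n = sum_{m=1}^{n} Q_m P_{n-m} (central relation, k = j)."""
    P = [1.0]
    for n in range(1, n_max + 1):
        P.append(sum(Q.get(m, 0.0) * P[n - m] for m in range(1, n + 1)))
    return P

# Hypothetical first-return law with period 1.
Q = {1: 0.5, 2: 0.3, 3: 0.2}
tau = sum(m * q for m, q in Q.items())    # expected return time
P = return_probabilities(Q, 200)          # P[n] approaches 1 / tau

# Hypothetical first-return law with period d = 2 (returns at even times only).
Q2 = {2: 0.5, 4: 0.5}
tau2 = sum(m * q for m, q in Q2.items())
P2 = return_probabilities(Q2, 200)        # P2[2n] approaches 2 / tau2, P2[odd] = 0
```

In the periodic case the odd-indexed terms vanish identically, so only the subsequence along multiples of the period has a positive limit.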

Since, for a state $j$ with period $d_j$, there exists a finite number of integers $p_i$ such that $P_{jj}^{p_i} > 0$ and, for $m$ sufficiently large, $m d_j = \sum_i m_i p_i$, it follows, by $P_{jj}^{m d_j} \ge \prod_i P_{jj}^{m_i p_i} > 0$, that

If $d_j$ is the period of $j$, then $P_{jj}^{m d_j} > 0$ for all sufficiently large values of $m$.

In other words, after some time elapses the system returns to $j$ with positive probability after every interval of time $d_j$.

We can now describe the asymptotic behavior of the system. If $k$ is a return state of period $d_k$, set

$$q_{jk}(r) = \sum_{n=0}^{\infty} Q_{jk}^{n d_k + r}, \quad r = 1, 2, \ldots, d_k,$$

so that $q_{jk}(r)$ is the probability of passage from $j$ to $k$ in $n \equiv r$ (modulo $d_k$) steps and

$$\sum_{r=1}^{d_k} q_{jk}(r) = q_{jk}.$$

Asymptotic passage theorem. For every state $j$
if $k$ is a null state, then $P_{jk}^{n} \to 0$;
if $k$ is a positive state, then $P_{jk}^{n d_k + r} \to q_{jk}(r)\, d_k/\tau_{kk}$
and, whatever be the state $k$,

$$\frac{1}{n} \sum_{m=1}^{n} P_{jk}^{m} \to \bar{P}_{jk} = \frac{q_{jk}}{\tau_{kk}}.$$

The theorem results from the positivity criterion and the central relation, as follows:

If $k$ is a null state, then $P_{kk}^{n} \to 0$. Therefore, from

$$P_{jk}^{n} \le \sum_{m=1}^{n'} Q_{jk}^{m} P_{kk}^{n-m} + \sum_{m=n'+1}^{n} Q_{jk}^{m}$$

it follows, upon letting $n \to \infty$ and then $n' \to \infty$, that $P_{jk}^{n} \to 0$.

If $k$ is a positive state, then $P_{kk}^{n d_k + r} = 0$ for $0 < r < d_k$ and $P_{kk}^{n d_k} \to d_k/\tau_{kk}$. Therefore, from

$$0 \le P_{jk}^{n d_k + r} - \sum_{m=0}^{n'} Q_{jk}^{m d_k + r} P_{kk}^{(n-m) d_k} \le \sum_{m=n'+1}^{n} Q_{jk}^{m d_k + r}$$

it follows, upon letting $n \to \infty$ and then $n' \to \infty$, that $P_{jk}^{n d_k + r} \to q_{jk}(r)\, d_k/\tau_{kk}$.

The last assertion follows from the first two assertions.
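As an illustration of the theorem for $j = k$, here is a small sketch on a hypothetical three-state chain of period 2. The matrix `P` and the value `tau_00 = 4` are assumptions specific to this example (the return to state 0 takes 2 steps plus 2 steps for each excursion to state 2, which gives mean 4).

```python
def mat_mul(A, B):
    size = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(size)) for j in range(size)]
            for i in range(size)]

# Hypothetical chain on states 0, 1, 2 with period d = 2.
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]

def n_step_matrices(P, n_max):
    """Return the list [P^1, P^2, ..., P^n_max]."""
    powers, current = [], P
    for _ in range(n_max):
        powers.append(current)
        current = mat_mul(current, P)
    return powers

powers = n_step_matrices(P, 400)
d, tau_00 = 2, 4.0
even_value = powers[399][0][0]   # P_00^400, close to d / tau_00
odd_value = powers[398][0][0]    # P_00^399, zero since 399 is not a multiple of d
cesaro_mean = sum(M[0][0] for M in powers) / len(powers)   # close to 1 / tau_00
```

The averaged probabilities converge even though the sequence $P_{00}^{n}$ itself oscillates between 0 and $d/\tau_{00}$.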

*6. Motion of the system. To investigate the motion of the system 
we have to consider the probabilities of passage from one state to an¬ 
other. But, first, let us introduce a convenient terminology. 

A state $j$ is an everreturn state if, for every state $k$ such that $q_{jk} > 0$, we have $q_{kj} > 0$. Two states $j$ and $k$ are equivalent and we write $j \sim k$ if $q_{jk} > 0$ and $q_{kj} > 0$; they are similar if they have the same period and are of the same type. A class of similar states will be qualified according to the common type of its states.

A class of states is indecomposable if any two of its states are equiva¬ 
lent, and it is closed if the probability of staying within the class is one. 
For example, the class of all states is closed but not necessarily inde¬ 
composable. 

The motion of the system is described by the foregoing asymptotic be¬ 
havior of the probabilities of passage from a given state to another 
given state, and also by the following theorem. 


Decomposition theorem. The class of all return states splits into equivalence classes which are indecomposable classes of similar states.

A not everreturn equivalence class is not closed. An everreturn equivalence class is closed; if its period $d > 1$, then it splits into $d$ cyclic subclasses $C(1), C(2), \ldots, C(d)$ such that the system passes from a state in $C(r)$ to a state in $C(r+1)$ ($C(d+1) = C(1)$) with probability 1.

The proof is simple but somewhat long. To begin with, we observe that, if $j$ and $k$ are two equivalent states, distinct or not, then there exist two integers, say $m$ and $p$, such that $P_{jk}^{m} > 0$, $P_{kj}^{p} > 0$.

1° The set of all states which are equivalent to some state coincides with the set of all return states. For, on the one hand, every return state is equivalent to itself and, on the other hand, if $j \sim k$, then $q_{jj} \ge P_{jj}^{m+p} \ge P_{jk}^{m} P_{kj}^{p} > 0$. Thus, the relation $j \sim k$, symmetric by definition, is reflexive: $j \sim j$. It is also transitive, for $j \sim k$ implies $P_{jk}^{m} > 0$, $k \sim h$ implies $P_{kh}^{n} > 0$ for some integer $n$ and, hence, $q_{jh} \ge P_{jh}^{m+n} \ge P_{jk}^{m} P_{kh}^{n} > 0$; similarly for $q_{hj}$. Therefore, the relation $j \sim k$ has the usual properties of an equivalence relation and the set of all return states splits into indecomposable equivalence classes.

We prove now that, if $j \sim k$, then they are similar. We know already that they are both return states; let $d_j$ and $d_k$ be their respective periods. There exists an integer $n$ such that $P_{kk}^{n} > 0$; hence $P_{kk}^{2n} \ge P_{kk}^{n} P_{kk}^{n} > 0$ and $P_{jj}^{m+n+p} \ge P_{jk}^{m} P_{kk}^{n} P_{kj}^{p} > 0$; similarly, $P_{jj}^{m+2n+p} > 0$. Therefore, $d_j$, being a divisor of $m + n + p$ and of $m + 2n + p$, is a divisor of every such $n$ and hence of $d_k$. By interchanging $j$ and $k$, it follows that $j$ and $k$ have the same period.

If $j$ is an everreturn state and $P_{kh}^{q} > 0$, then, from $P_{jh}^{m+q} \ge P_{jk}^{m} P_{kh}^{q} > 0$, it follows that there exists an integer $r$ such that $P_{hj}^{r} > 0$; hence $P_{hk}^{r+m} \ge P_{hj}^{r} P_{jk}^{m} > 0$, and $k$ is an everreturn state. By interchanging $j$ and $k$, it follows that they are both either everreturn or not everreturn states.

If $k$ is recurrent, then, by the recurrence criterion,

$$\sum_{n=1}^{\infty} P_{jj}^{m+n+p} \ge P_{jk}^{m} \Big( \sum_{n=1}^{\infty} P_{kk}^{n} \Big) P_{kj}^{p} = \infty$$

and $j$ is recurrent. By interchanging $j$ and $k$, it follows that they are both either recurrent or nonrecurrent.

If $d$ is the common period of the two equivalent states $j$ and $k$, then, from

$$P_{jj}^{m+nd+p} \ge P_{jk}^{m} P_{kk}^{nd} P_{kj}^{p},$$

it follows that $d$ is a divisor of $m + p$ and $\lim P_{kk}^{nd} > 0$ implies $\lim P_{jj}^{nd} > 0$. Hence, upon applying the positivity criterion and then interchanging $j$ and $k$, both are either positive or null. This completes the proof of the first assertion.

2° If $j$ is a return but not an everreturn state, then there exists a state $h$ such that $q_{jh} > 0$ while $q_{hj} = 0$, so that $h$ is not equivalent to $j$ and there is a positive probability of leaving the equivalence class of $j$.

If $j$ is an everreturn state, then $q_{jk} > 0$ entails $q_{kj} > 0$ so that $k$ belongs to the equivalence class of $j$. Therefore, the probability of passage from $j$ to a state which does not belong to the equivalence class of $j$ is zero and, the class of all states being countable, the probability of leaving this class is zero.

Finally, we split an everreturn equivalence class $C$ of period $d > 1$ as follows: Let $j$ and $k$ belong to $C$. Since $P_{jj}^{m+p} \ge P_{jk}^{m} P_{kj}^{p} > 0$, $d$ is a divisor of $m + p$ and, if $m_1$ and $m_2$ are two values of $m$, then $m_1 \equiv m_2$ (modulo $d$). Thus, fixing $j$, to every $k$ belonging to $C$ there corresponds a unique integer $r = 1$ or $2, \ldots,$ or $d$ such that, if $P_{jk}^{m} > 0$, then $m \equiv r$ (modulo $d$). The states belonging to $C$ with the same value of $r$ form a subclass $C(r)$ and $C$ splits into subclasses $C(1), C(2), \ldots, C(d)$. It follows that, if $k$ and $k'$ belong respectively to $C(r)$ and $C(r')$, then $P_{kk'}^{n}$ can be positive only for $n \equiv r' - r$ (modulo $d$). Moreover, according to the proposition which follows the positivity criterion, $P_{kk'}^{n} > 0$ for all such $n$ sufficiently large. Thus no subclass $C(r)$ is empty and the system moves cyclically from $C(r)$ to $C(r+1)$, ..., with $C(d+1) = C(1)$. This proves the second assertion.
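For a finite chain the decomposition can be computed directly from the pattern of positive entries of the transition matrix. The sketch below is illustrative only: the five-state matrix `P` is hypothetical, and the cyclic subclasses are indexed by residues $0, \ldots, d-1$ rather than $1, \ldots, d$ as in the text.

```python
from math import gcd

def reachable(P, j):
    """States k with q_jk > 0, that is, reachable from j in one or more steps."""
    seen, stack = set(), [j]
    while stack:
        i = stack.pop()
        for k, p in enumerate(P[i]):
            if p > 0 and k not in seen:
                seen.add(k)
                stack.append(k)
    return seen

def equivalence_classes(P):
    """Equivalence classes of return states, each with a flag telling whether it is closed."""
    reach = [reachable(P, j) for j in range(len(P))]
    classes, assigned = [], set()
    for j in range(len(P)):
        if j in reach[j] and j not in assigned:        # j is a return state
            cls = {k for k in reach[j] if j in reach[k]}
            assigned |= cls
            classes.append((sorted(cls), reach[j] <= cls))
    return classes

def cyclic_subclasses(P, cls):
    """Period d of a closed class and its split by step count from cls[0] modulo d."""
    level, queue, d = {cls[0]: 0}, [cls[0]], 0
    while queue:
        u = queue.pop(0)
        for v in cls:
            if P[u][v] > 0:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
                d = gcd(d, level[u] + 1 - level[v])
    groups = [[] for _ in range(d)]
    for k in cls:
        groups[level[k] % d].append(k)
    return d, groups

# Hypothetical chain: {0, 1, 2} closed with period 2, {3} not closed, {4} absorbing.
P = [[0.0, 1.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0, 0.5, 0.0],
     [0.0, 0.0, 0.0, 0.0, 1.0]]
classes = equivalence_classes(P)
period, subclasses = cyclic_subclasses(P, [0, 1, 2])
```

State 3 is a return state that is not everreturn, so its class is reported as not closed, in agreement with the theorem.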

Corollary 1. The states of an everreturn equivalence class $C$ are linked in a constant chain whose transition matrix is obtained from the initial transition matrix $P$ by deleting all those $P_{jk}$ for which $j$ or $k$ or both do not belong to $C$.

Corollary 2. The states of a cyclic subclass $C(r)$ of an everreturn equivalence class with period $d$ are linked in a constant chain whose transition matrix $P'$ is obtained from $P^{d}$ by deleting all those $P_{jk}^{d}$ for which $j$ or $k$ or both do not belong to $C(r)$.

Corollary 3. An everreturn null equivalence class $C$ is either empty or infinite. In particular, a finite chain has no everreturn null states.

Let $C$ be finite nonempty. By the asymptotic passage theorem, $P_{jk}^{n} \to 0$ for $k \in C$. But $C$ is closed, so that $1 = \sum_{k \in C} P_{jk}^{n} \to 0$ for $j \in C$, and we reach a contradiction.

Corollary 4. If $j$ and $k$ are nonequivalent everreturn states, then

$$P_{jk}^{n} = 0.$$

If $j$ and $k$ are equivalent positive states, with period $d$, then

$$P_{jk}^{nd+r} \to d/\tau_{kk} \quad \text{for some } r = r(j, k),$$

$$P_{jk}^{nd+r'} = 0 \quad \text{for } r' \not\equiv r \text{ (modulo } d\text{)}.$$

This follows by the asymptotic passage theorem.

*7. Stationary chains. The evolution of a system is determined by the laws which govern the system. In the case of constant elementary chains these laws are represented by the transition matrix $P$ with elements $P_{jk}$. While $P$ determines probabilities of passage from one state to another, it does not determine the probability that at a given time the system be in a given state. To obtain such probabilities we have to know the initial conditions. In the deterministic case this is the state at time 0. In our case it is the probability distribution at time 0, that is, the set of probabilities $P_j$ for the system to be in the state $j$ at time 0. Then, according to the total probability rule, the probability $P_k^{n}$ that the system be in the state $k$ at time $n = 1, 2, \ldots$, is

$$P_k^{n} = \sum_j P_j P_{jk}^{n}.$$

The notion of statistical equilibrium corresponds to the concept of stationarity in time. In our case of a constant elementary chain with transition matrix $P$, it is stationary if $P_k^{n} = P_k$ for every state $k$ and every $n = 1, 2, \ldots$.

Given the laws of evolution represented by a transition matrix, the problem arises whether or not there exist initial conditions represented by the initial probability distribution such that the chain is stationary; in other words, whether or not there exists a probability distribution $\{P_j\}$ which remains invariant under transitions. In general, one expects that if, under given laws of evolution, an equilibrium is possible, then it is attained in the long run. To this somewhat vague idea corresponds the following

Invariance theorem. For states $j$ belonging to a cyclic subclass of a positive equivalence class with period $d$, the set of values $\bar{P}_j = d/\tau_{jj}$ is an invariant and the only invariant distribution under the transition matrix of the subclass.

According to Corollary 2 of the decomposition theorem, it suffices to consider the chain formed by the subclass, that is, by one cyclic positive class with some transition matrix $P$ of period one. According to the asymptotic passage theorem,

$$P_{jk}^{n} \to \bar{P}_k > 0.$$

Since

$$\sum_k P_{jk}^{n} = 1 \quad \text{and} \quad P_{jk}^{n+m} = \sum_h P_{jh}^{n} P_{hk}^{m},$$

it follows, upon taking arbitrary but finite sets of states and letting $n \to \infty$, that

$$\sum_k \bar{P}_k \le 1, \qquad \bar{P}_k \ge \sum_h \bar{P}_h P_{hk}^{m}.$$

But if, for some $k$, the second inequality is strict, then summing over all states $k$, we obtain

$$\sum_k \bar{P}_k > \sum_h \bar{P}_h,$$

so that, ab contrario,

$$\bar{P}_k = \sum_h \bar{P}_h P_{hk}^{m}.$$

Since $\sum_h \bar{P}_h$ is finite, we can pass to the limit under the summation sign, so that, by letting $m \to \infty$, we obtain

$$\bar{P}_k = \Big( \sum_h \bar{P}_h \Big) \bar{P}_k$$

and, $\bar{P}_k$ being positive, it follows that $\sum_h \bar{P}_h = 1$. Thus, the set of values $\bar{P}_k$ is a probability distribution invariant under $P$.

It remains for us to prove that, if a set of values $P_k$ has the same properties, then $P_k = \bar{P}_k$. But from

$$P_k = \sum_h P_h P_{hk}^{n}$$

it follows, as before, that $P_k = (\sum_h P_h) \bar{P}_k = \bar{P}_k$, and the conclusion is reached.
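The identification of the invariant distribution with the reciprocals of the mean return times can be checked on a small example. This is a sketch under stated assumptions: the aperiodic matrix `P` is hypothetical, the limits $\bar{P}_k$ are approximated by a row of $P^{200}$, and the first-return probabilities are recovered from the central relation $P_{kk}^{n} = \sum_{m=1}^{n} Q_{kk}^{m} P_{kk}^{n-m}$.

```python
def mat_mul(A, B):
    size = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(size)) for j in range(size)]
            for i in range(size)]

# Hypothetical positive class of period one.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

def power_and_diagonals(P, n_max):
    """Return P^n_max together with diag[n][k] = P_kk^n for n = 0..n_max."""
    size = len(P)
    current = [[float(i == j) for j in range(size)] for i in range(size)]
    diag = [[1.0] * size]
    for _ in range(n_max):
        current = mat_mul(current, P)
        diag.append([current[k][k] for k in range(size)])
    return current, diag

def mean_return_time(diag, k):
    """tau_kk = sum_n n Q_kk^n, with Q_kk^n solved from the central relation."""
    Q = [0.0]
    for n in range(1, len(diag)):
        Q.append(diag[n][k] - sum(Q[m] * diag[n - m][k] for m in range(1, n)))
    return sum(n * q for n, q in enumerate(Q))

P_n, diag = power_and_diagonals(P, 200)
bar_P = P_n[0]                                   # approximates the limits of P_jk^n
tau = [mean_return_time(diag, k) for k in range(3)]
image = [sum(bar_P[h] * P[h][k] for h in range(3)) for k in range(3)]
```

Here `image` is the distribution obtained from `bar_P` after one transition; invariance means the two lists agree.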

Corollary. If $C$ is a positive equivalence class, then

$$\sum_{j \in C} \frac{1}{\tau_{jj}} = 1.$$

This follows from

$$d \sum_{j \in C} \frac{1}{\tau_{jj}} = \sum_{r=1}^{d} \sum_{j \in C(r)} \frac{d}{\tau_{jj}} = d.$$

Stationarity theorem. A constant elementary chain with transition matrix $P$ is stationary with initial probability distribution $\{P_k\}$ if, and only if, $P_k = 0$ for all null states $k$, and $P_k = p_t/\tau_{kk}$ for all states belonging to positive equivalence classes $C_t$, with $\sum_t p_t = 1$.

Let the probability distribution $\{P_k\}$ be invariant under the transition matrix $P$ so that

$$P_k = \sum_j P_j P_{jk}^{n}.$$

If $k$ is a null state, then, by the asymptotic passage theorem, $P_{jk}^{n} \to 0$. $\sum_j P_j$ being finite, we can pass to the limit under the summation sign. It follows, upon letting $n \to \infty$, that $P_k = 0$. Hence, by summing over positive states only, $\sum P_k = 1$.

If $k$ belongs to a positive equivalence class $C_t$, then, by the asymptotic passage theorem, we have that $P_{jk}^{n} = 0$ for every $j$ which does not belong to $C_t$ and $\frac{1}{n} \sum_{m=1}^{n} P_{jk}^{m} \to \frac{1}{\tau_{kk}}$ for every $j$ belonging to $C_t$. It follows that

$$P_k = \sum_{j \in C_t} P_j P_{jk}^{n} = \sum_{j \in C_t} P_j \Big( \frac{1}{n} \sum_{m=1}^{n} P_{jk}^{m} \Big) \to \frac{p_t}{\tau_{kk}}$$

where

$$p_t = \sum_{j \in C_t} P_j \quad \text{and} \quad \sum_t p_t = \sum_k P_k = 1.$$

This proves the “only if” assertion.

Conversely, let the conditions on the $P_k$ hold and use

$$P_k^{m} = \sum_j P_j P_{jk}^{m}$$

where the summation is over positive states $j$ only, since $P_j = 0$ for $j$ null.

Therefore, if $k$ is null, then $P_{jk}^{m} = 0$ and $P_k^{m} = 0$ for every $m$. If $k$ belongs to a positive equivalence class $C_t$, then, since $C_t$ is closed, $P_{jk}^{m} = 0$ for all states $j$ which do not belong to $C_t$, and, $C'_t$ being a finite subclass of $C_t$ such that $\sum P_j < \epsilon$ with sum over $j \in C_t - C'_t$, we have

$$P_k^{m} = \sum_{j \in C_t} P_j P_{jk}^{m} \le \sum_{j \in C'_t} \frac{p_t}{\tau_{jj}} P_{jk}^{m} + \epsilon.$$

Upon replacing $\frac{1}{\tau_{jj}}$ by the limit of the mean in the asymptotic passage theorem with subscripts $h, j \in C_t$, we obtain, by summing first over the $j$,

$$P_k^{m} \le p_t \lim_{n \to \infty} \frac{1}{n} \sum_{s=1}^{n} P_{hk}^{m+s} + \epsilon.$$

Hence

$$P_k^{m} \le \frac{p_t}{\tau_{kk}} + \epsilon$$

so that, letting $\epsilon \to 0$, we have $P_k^{m} \le \frac{p_t}{\tau_{kk}}$.

If, for some $k$, the inequality is a strict one, then, since for null states $P_k = 0$, it follows, by summing over positive states $k$ only, that

$$1 = \sum_k P_k^{m} < \sum_t p_t \sum_{k \in C_t} \frac{1}{\tau_{kk}} = \sum_t p_t = 1.$$

Therefore, $P_k^{m} = \frac{p_t}{\tau_{kk}}$ for every $m$, and the “if” assertion is proved.
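A short sketch of the theorem on a finite example follows. Everything in it is hypothetical: the matrix `P` has two closed positive classes, $\{0, 1\}$ and $\{2\}$, and one nonreturn state 3, and the mean return times in `tau` are read off by inspection for this particular chain.

```python
# Hypothetical chain: classes {0, 1} and {2} are closed; state 3 is never revisited.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5, 0.0]]
tau = {0: 2.0, 1: 2.0, 2: 1.0}     # mean return times of the positive states
class_of = {0: 0, 1: 0, 2: 1}      # index t of the class C_t containing each positive state

def candidate_distribution(p):
    """P_k = p_t / tau_kk on positive classes, and P_k = 0 elsewhere."""
    return [p[class_of[k]] / tau[k] if k in tau else 0.0 for k in range(len(P))]

def one_step(dist):
    """Total probability rule: distribution at the next time."""
    return [sum(dist[j] * P[j][k] for j in range(len(P))) for k in range(len(P))]

stationary = candidate_distribution([0.3, 0.7])   # class weights p_t with sum 1
after_step = one_step(stationary)                 # equals stationary
not_stationary = one_step([0.0, 0.0, 0.0, 1.0])   # mass on state 3 moves away
```

Any choice of nonnegative class weights with sum 1 gives a stationary distribution, which is why uniqueness holds only within a single class.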

COMPLEMENTS AND DETAILS 

I. Physical statistics. The problem is to determine the state of equilibrium of a physical system, of energy $E$, composed of a very large number $N$ of “particles” of the same nature: electrons, protons, photons, mesons, neutrons, etc.

Hypotheses. There are $g_1$ microscopic states of energy $\varepsilon_1$, $g_2$ of energy $\varepsilon_2$, ... and each particle is in one of these states. The macroscopic state, i.e., the state of the system, is specified by the number of particles at each energy level: $\nu_1$ particles of energy $\varepsilon_1$, $\nu_2$ particles of energy $\varepsilon_2$, .... The set $\{\nu_1, \nu_2, \ldots\}$ is a set of random integers and the probability of a macroscopic state $\nu_1 = n_1$, $\nu_2 = n_2$, ... is equal, up to a constant factor, to the number $W$ of ways in which $n_k$ particles can be distributed amongst $g_k$ microscopic states of energy $\varepsilon_k$, $k = 1, 2, \ldots$, provided

$$\sum_k n_k = N, \qquad \sum_k n_k \varepsilon_k = E.$$

The Maxwell-Boltzmann statistics (classical theory of gases) is that of distinguishable particles without exclusion, i.e., without any bound upon the possible number of particles in any of the microscopic states. The Bose-Einstein statistics (photons, mesons, deuterons, ...: particles with an integer “spin”) is that of nondistinguishable particles without exclusion. The Fermi-Dirac statistics (electrons, protons, neutrons: particles with a semi-integer “spin”) is that of nondistinguishable particles which obey the Pauli exclusion principle, that is, there cannot be more than one particle in any of the microscopic states.

Weights. Let $w$ denote the weight of the macroscopic state $\{n_1, n_2, \ldots\}$, i.e., $W/N!$ in the distinguishable case and $W$ in the nondistinguishable case. Prove that the combinatorial formulae give the following expressions for $w$, where it is assumed that $\sum_k n_k = N$, $\sum_k n_k \varepsilon_k = E$ (in the case of photons $N$ is not fixed and only the second condition remains):

Without exclusion, distinguishable particles (Maxwell-Boltzmann): $w = \prod_i g_i^{n_i}/n_i!$

Without exclusion, nondistinguishable particles (Bose-Einstein): $w = \prod_i \dfrac{(g_i + n_i - 1)!}{n_i!\,(g_i - 1)!}$

With exclusion, distinguishable particles (corresponds to no physical reality): $w = \prod_i \dfrac{g_i!}{n_i!\,(g_i - n_i)!}$

With exclusion, nondistinguishable particles (Fermi-Dirac): $w = \prod_i \dfrac{g_i!}{n_i!\,(g_i - n_i)!}$

When $g_k \gg n_k$, then the expressions of the weights in B.-E. and F.-D. statistics are equivalent to $w$ in M.-B. statistics. Assume distinguishability and let $c$ be the “capacity” coefficient of the microscopic states, that is, if there are already $n$ particles in the states of energy $\varepsilon_k$ ($k = 1, 2, \ldots$), the number of these $g_k$ states which remains available for the $(n+1)$th particle is $g_k - nc$; this is Brillouin statistics. The weights $w$ of the macroscopic states, previously defined as $w = W/N!$, are given by

$$w = \prod_k \frac{1}{n_k!}\, g_k (g_k - c) \cdots [g_k - (n_k - 1)c]$$

and reduce to those of M.-B., B.-E., and F.-D. by giving to the parameter $c$ the values 0, $-1$, $+1$ respectively.
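The reduction of the Brillouin weight to the three classical weights can be verified with exact arithmetic. In this minimal sketch the degeneracies `g` and occupation numbers `n` are hypothetical, and the three comparison values are written directly from the product formulas for each statistics.

```python
from fractions import Fraction
from math import comb, factorial

def brillouin_weight(g, n, c):
    """w = prod_k (1 / n_k!) g_k (g_k - c) ... (g_k - (n_k - 1) c)."""
    w = Fraction(1)
    for g_k, n_k in zip(g, n):
        for i in range(n_k):
            w *= g_k - i * c
        w /= factorial(n_k)
    return w

g, n = [4, 3], [2, 1]     # hypothetical degeneracies g_k and occupation numbers n_k

maxwell_boltzmann = Fraction(4 ** 2, factorial(2)) * Fraction(3 ** 1, factorial(1))
bose_einstein = comb(4 + 2 - 1, 2) * comb(3 + 1 - 1, 1)
fermi_dirac = comb(4, 2) * comb(3, 1)

weights = {c: brillouin_weight(g, n, c) for c in (0, -1, 1)}
```

The dictionary `weights` then holds the M.-B., B.-E., and F.-D. weights under the keys 0, -1, and 1.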

Statistical equilibrium. For a very large $N$ the equilibrium state of the macroscopic system is postulated to be the most probable one, that is, the one with the highest weight. Assume that Stirling’s formula can be used for the factorials which figure in the table of weights above. Take the variation $\delta \log w$ which corresponds to the variation $\{\delta n_1, \delta n_2, \ldots\}$. Using the Lagrange multipliers method, the state which corresponds to the maximum of $w$ is determined by solving the system (prove)

$$\delta \log w + \lambda \cdot \delta N + \mu \cdot \delta E = 0,$$
$$\sum_k n_k = N, \qquad \sum_k n_k \varepsilon_k = E.$$

(In the case of photons take $\lambda = 0$ and suppress the second relation.) The equilibrium states for the various statistics are also obtained by replacing $c$ by 0, $-1$, and 1 in the equilibrium state for Brillouin statistics, given by

$$n_k = g_k/(e^{\lambda + \mu \varepsilon_k} + c)$$

where $\lambda$ and $\mu$ are determined by the subsidiary conditions

$$N = \sum_k g_k/(e^{\lambda + \mu \varepsilon_k} + c), \qquad E = \sum_k g_k \varepsilon_k/(e^{\lambda + \mu \varepsilon_k} + c).$$

The Planck-Bose-Allard method. The macroscopic states can be described in a more precise manner. Instead of asking for the number $n_k$ of particles in the states of energy $\varepsilon_k$, we ask for the number $g_{km}$ of states of energy $\varepsilon_k$ occupied by $m$ particles. The particles are assumed to be nondistinguishable as required by modern physics. The combinatorial formulae give

$$W = \prod_k \Big( g_k! \Big/ \prod_m g_{km}! \Big) \quad \text{with} \quad g_k = \sum_m g_{km}, \quad N = \sum_{k,m} m\, g_{km}, \quad E = \sum_{k,m} \varepsilon_k\, m\, g_{km}.$$

To obtain the statistical equilibrium state use the procedure described above.

B.-E. statistics is obtained if no bounds are imposed upon the values of $m$. F.-D. statistics is obtained if $m$ can take only the values 0 or 1; “intermediate” statistics is obtained if $m$ can take only the values belonging to a fixed set of integers.

In the equilibrium state (with $c = -1$ or $+1$ when the statistics are B.-E.’s or F.-D.’s respectively), we have

gkm = ^(1 + cak)^'cak m y where a k = 

and $g_k(u)$, determined by the usual subsidiary conditions, the generating function of the number of particles in a microscopic state of energy $\varepsilon_k$, is

$$g_k(u) = (1 + c a_k)^{-g_k/c} \cdot (1 + c a_k u)^{g_k/c}.$$

II. The method of indicators. 

1. Rule: In order to compute $PB$, $B = f(A_1, A_1^c, \ldots, A_m, A_m^c)$, take the following steps:

(a) Reduce the operations on events to complementations, intersections, and 
sums; 

(b) Replace each event by its indicator, expand, and take the expectation. 

In this way find 

P( U Aj) and P{ fl Aj U Af) in terms of P( fl A^s. 

/■I / 麵 A:+l f 麵 1 

m 

Notations. Let $I_{A_j} = I_j$ and let $R = \sum_{j=1}^{m} I_j$ be the “repetition” of $A_j$’s, that is, the number of events $A_j$ which occur. Let $J_0 = 1$, $J_r = \sum I_{j_1} \cdots I_{j_r}$ where the summation is over all combinations $1 \le j_1 < j_2 < \cdots < j_r \le m$. Let $I_{[r]}$ and $I_{(r)}$ be indicators of the events exactly $r$ $A_j$’s occur and at least $r$ $A_j$’s occur, respectively; set

$$S_r = EJ_r, \quad P_{[r]} = EI_{[r]}, \quad P_{(r)} = EI_{(r)}.$$




2. Prove

(a) $\sum_{r=0}^{m} u^r I_{[r]} = \sum_{s=0}^{m} (u - 1)^s J_s = u^R$,

and deduce

(b) $P_{[r]} = \sum_{k=r}^{m} (-1)^{k-r} C_k^r S_k$,

(c) $S_k = \sum_{t=k}^{m} C_t^k P_{[t]} = \sum_{t=k}^{m} C_{t-1}^{k-1} P_{(t)}$,

(d) $R(R - 1) \cdots (R - k + 1) = k!\, J_k$.

3. Let $k \le r \le m$. Using 2(c) and the relations

$$I_{(m)} \le \cdots \le I_{(r)} \le I_{(r-1)} \le \cdots \le I_{(k)},$$

prove that

$$(S_k - C_{r-1}^k)/(C_m^k - C_{r-1}^k) \le P_{(r)} \le S_k/C_r^k.$$

Examine the special case $r = m$; the left-hand side becomes Gumbel’s inequality; the right-hand side becomes Fréchet’s inequality.

Let 

m = i - h/c k m , am =/(k + D ~/{k). 

4. Prove 

-k 


(a) 

(b) 

(c) 


m — I /~>t — 

△ 朋 - 


hr] ^ 


c r m 




A; — 1 


4/( 是 )， k ^ r m — 


AJ(k) ^ 0 ； 


deduce a scale of inequalities for the Sk's. 
5. 


(a) 

(b) 

(c) 


■mA 


— i 


(1 — 1(d), 


(M) = "g 1 
V k) h a-j 

cL~-i / ▲ m\ 

— -― I, 


一 / ⑴ g 


-k 

-k- 


( 


■mA ■ 


/ 


w △孕 g 0; 

k 


deduce another scale of inequalities for SkS. 

6. The general symbolic method. The events $B_1, \ldots, B_m$ are called exchangeable if $P(B_{i_1} \cdots B_{i_r} B_{i_{r+1}}^c \cdots B_{i_{r+s}}^c)$ depends only on the number $r$ of events $B_i$ and on the number $s$ of events $B_i^c$.

Let 


Jr/, = E /(A) … /(A r )/(U … /(A+/) 
Sr/., = E(J r /,) = 22 P (山 i ... AA+1 C … A 

$p_{r/s} = P(B_{i_1} \cdots B_{i_r} B_{i_{r+1}}^c \cdots B_{i_{r+s}}^c)$.
If we choose the $B_j$’s such that


then 


L ••• A ir A ir ^ • • • A ir J) = Z P(B h ... B ir B ir 彳 … Bir j), 


Pr/s = S T f 9 / 


If we further introduce symbolic independent events having the same prob¬ 
ability p y the complementary events having probability y = 1 — 多， then, sym¬ 
bolically, 

Pr/s = P r q 8 . 


The symbolic method consists of the following steps: 

(a) In any given identity (or identical inequality) for p，g (0 ^ p ^ l 
穿 =1 一 p) replace p r g 8 by p r / 8 . 

(b) Replace p r / 8 by Sr! 9 /C r m C s m ^ r and obtain an equality (or inequality resp.) 
for the S r / 8 9 s. 


Examples: 

(a) Starting from p r cf = p r (l — p) 8 obtain 


Sr/s 




in the special case r + s = m y find 

Sr/m—r = 户 M. 

(b) Starting from p r q 8 = p r (f{p 4 - g) m ~ r ~ s y obtain 


S r /s — 2^ ClCh 一 

i^T 

In the special case s = 0 y find 

S r /0 = Sr = • • • • 

(c) Starting from p r ， cf f ^ p r q 8 y r f ^ r y s f ^ s y find 

^ J r 'r ， r' ^ r, s' ^ s, 

and as a special case the scale of inequalities (4c). 

r 

(d) Starting from 1 ^ C?p r -V where 22’ denotes a sum in which a certain 

i = 0 

number of terms is omitted, find 

Y ： S T 一 Xli 

and, taking only the terms i = 0 and / = 1， find the second scale of inequalities 
(5c). 

7. The classical problem of matching. This problem (problème des rencontres) was studied first by Montmort (1708) and further treated by Lambert, Euler, and others in different forms, all of which can be described by the following setup: given $m$ distinct numbers $X_1, \ldots, X_m$, choose at random a first $X_{i_1}$, then a second $X_{i_2}$ from the remaining ones, etc. A match (coincidence, rencontre) is an event $A_i$ which consists in choosing exactly $X_i$ at the $i$th draw.

In the following, assume that each permutation $(X_{i_1}, \ldots, X_{i_m})$ has the same probability of being chosen at random. Show that

(a) $P(A_{i_1} \cdots A_{i_r}) = \dfrac{(m - r)!}{m!}$ and $S_r = \dfrac{1}{r!}$;

(b) $P_{[r]} = \dfrac{1}{r!} \sum_{k=0}^{m-r} \dfrac{(-1)^k}{k!}$.

(c) Find $\lim_{m \to \infty} P_{[r]}$; interpretation? Show that $P_{[m-1]} = 0$; interpretation?

(d) Show that $E(R) = \sum_{r} r P_{[r]} = S_1 = 1$ and $E[R - E(R)]^2 = 1$. (Use the generating function $\sum_{r=0}^{m} u^r P_{[r]}$.)
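The matching problem is small enough to check by brute force. The sketch below enumerates all permutations of $m = 6$ items (the value of `m` is an arbitrary choice) and compares the distribution of the number of matches with the classical closed form $\frac{1}{r!} \sum_{k=0}^{m-r} (-1)^k/k!$, which is assumed here as the standard result for this problem.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def match_distribution(m):
    """P_[r], r = 0..m: probability of exactly r matches among m equally likely permutations."""
    counts = [0] * (m + 1)
    for perm in permutations(range(m)):
        counts[sum(1 for i, x in enumerate(perm) if i == x)] += 1
    return [Fraction(count, factorial(m)) for count in counts]

m = 6
p = match_distribution(m)
mean = sum(r * p_r for r, p_r in enumerate(p))
variance = sum((r - mean) ** 2 * p_r for r, p_r in enumerate(p))
closed_form = [
    Fraction(1, factorial(r))
    * sum(Fraction((-1) ** k, factorial(k)) for k in range(m - r + 1))
    for r in range(m + 1)
]
```

Exactly $m - 1$ matches is impossible because the remaining item would then be forced to match as well, which is the interpretation asked for in (c).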

III. Random walk. A particle starting at some point of an $m$-dimensional space moves in such a way that its consecutive displacements can be represented by independent $m$-dimensional random vectors. Problems of the following type arise: find the probability that in time $T$ or before time $T$ the particle reaches a certain domain $D$, or that it reaches $D$ without having reached previously a domain $D'$, or find the expected time for the particle to reach $D$, etc.

We give a few examples which show the great variety of forms under which this problem occurs, questions which can be asked, and methods of solution. We restrict ourselves to the discontinuous case with every move taking one unit of time.

1. Game of “heads or tails” and combinatorial method. To $n$ tosses of a coin with equal probabilities for heads and for tails we associate the score point whose coordinates are respectively the number of heads and the number of tails which occur. Thus, at every toss, the score point $M$ moves by one unit either upwards or to the right, and the game is represented by a two-dimensional one-sided random walk on the lattice of points with integer coordinates.

The score points corresponding to the same number $n$ of tosses lie on the line $x + y = n$. The total number of paths between 0 and $M = (a, b)$ is $\dfrac{(a + b)!}{a!\, b!}$.

(a) If A and B ran for office, A got $a$ votes and B got $b < a$ votes, find the pr. $P$ that in counting the votes A be always ahead of B.

(Equivalent to the pr. that the score point stays below the bisectrix until it reaches the point $M = (a, b)$. Compute the pr. of the complementary event by applying the symmetry principle of Désiré André as follows: the paths from 0 to $M$ which intersect the bisectrix either go through (1, 0) or through (0, 1). By reason of symmetry both classes contain the same number of paths. The number of those which go from (0, 1) to $M$ is $(a + b - 1)!/a!\,(b - 1)!$, and $P = \dfrac{a - b}{a + b}$.)

(b) The probability that there be neither gain nor loss in exactly $2n$ tosses is

$$\frac{1 \cdot 3 \cdots (2n - 3)}{2 \cdot 4 \cdot 6 \cdots 2n} \sim \frac{1}{2\sqrt{\pi}\, n^{3/2}}.$$

(Start from the number of paths from 0 to $(n, n - 1)$ which do not intersect the bisectrix.)

(c) The probability that the gambler who bets on heads and whose fortune is $m$ times the stake loses his fortune in $m + 2n$ tosses is $m(m + n + 1) \cdots (m + 2n - 1)/2^{m+2n}\, n!$. (Reduce to (a) by taking for origin the point $(m + n, n)$.)
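The answer to the ballot question in (a), $P = (a - b)/(a + b)$, can be confirmed by enumerating every order in which the votes may be counted. This is a brute-force sketch for small hypothetical vote totals, not the combinatorial argument of the text.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def always_ahead_probability(a, b):
    """Probability that A leads strictly throughout the count, all orders equally likely."""
    total, favorable = a + b, 0
    for a_positions in combinations(range(total), a):
        a_set, lead, ahead = set(a_positions), 0, True
        for t in range(total):
            lead += 1 if t in a_set else -1
            if lead <= 0:          # the score point has reached the bisectrix
                ahead = False
                break
        favorable += ahead
    return Fraction(favorable, comb(total, a))

p_5_3 = always_ahead_probability(5, 3)
p_4_1 = always_ahead_probability(4, 1)
```

Each choice of positions for the votes of A corresponds to one lattice path from 0 to $M = (a, b)$.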

2. Gambler’s ruin.

(a) Method of difference equations. Consider a one-dimensional random walk on the lattice $x = 0, \pm 1, \pm 2, \ldots$. At each step the particle at $y$ has probability $p_k$ to move from $y$ to $y + k$, $k = 0, \pm 1, \pm 2, \ldots$. Let $P_x$ be the probability of ruin, that is, starting at $x$ with $0 < x < a$ to arrive at $y \le 0$ before reaching $y \ge a$. Then $P_x = \sum_y P_y\, p_{y-x}$ with boundary conditions $P_y = 1$ if $y \le 0$ and $P_y = 0$ if $y \ge a$.

The gambler has $x$ dollars and wins or loses one dollar with respective probabilities $p$ and $q = 1 - p$. Find the probability $P_x$ of his ruin. Find the probability $P_{xn}$ of his ruin at the $n$th game.

In the first case, $P_x = pP_{x+1} + qP_{x-1}$ with $P_0 = 1$, $P_a = 0$. The solution is

$$P_x = \frac{(q/p)^a - (q/p)^x}{(q/p)^a - 1} \ \text{ for } p \ne q \quad \text{and} \quad P_x = 1 - \frac{x}{a} \ \text{ for } p = q.$$

In the second case $P_{x,n+1} = pP_{x+1,n} + qP_{x-1,n}$ with $P_{0n} = P_{an} = 0$ and $P_{00} = 1$, $P_{x0} = 0$. The solution is

$$P_{xn} = \frac{2^n}{a}\, p^{(n-x)/2} q^{(n+x)/2} \sum_{k=1}^{a-1} \cos^{n-1} \frac{\pi k}{a} \sin \frac{\pi k}{a} \sin \frac{\pi x k}{a}.$$
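Both solutions can be checked against the difference equations themselves. The sketch below iterates $P_{x,n+1} = pP_{x+1,n} + qP_{x-1,n}$ with the stated boundary conditions; the parameters $a = 5$, $p = 0.45$, $x = 2$ are hypothetical, and the trigonometric expression is the classical formula for ruin at the $n$th game, taken here as an assumption to be tested rather than derived.

```python
from math import cos, pi, sin

def ruin_probability(x, a, p):
    """Closed form for the probability of eventual ruin starting from x."""
    q = 1.0 - p
    if p == q:
        return 1.0 - x / a
    ratio = q / p
    return (ratio ** a - ratio ** x) / (ratio ** a - 1.0)

def ruin_by_recursion(a, p, n_games):
    """history[n][x] = probability of ruin at exactly the n-th game starting from x."""
    q = 1.0 - p
    current = [0.0] * (a + 1)
    current[0] = 1.0                       # P_00 = 1 and P_x0 = 0 for x > 0
    history = [current]
    for _ in range(n_games):
        nxt = [0.0] * (a + 1)              # P_0n = P_an = 0 for n >= 1
        for x in range(1, a):
            nxt[x] = p * current[x + 1] + q * current[x - 1]
        history.append(nxt)
        current = nxt
    return history

def ruin_at_game(x, a, p, n):
    """Classical trigonometric formula for ruin at the n-th game (n >= 1)."""
    q = 1.0 - p
    total = sum(cos(pi * k / a) ** (n - 1) * sin(pi * k / a) * sin(pi * x * k / a)
                for k in range(1, a))
    return (2.0 ** n / a) * p ** ((n - x) / 2) * q ** ((n + x) / 2) * total

history = ruin_by_recursion(5, 0.45, 2000)
eventual_ruin = sum(step[2] for step in history[1:])   # sum over n of P_{2,n}
```

Summing the probabilities of ruin at the $n$th game over all $n$ recovers the probability of eventual ruin.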

(b) Method of matrices. Same random walk but with pi = p-\ = 1/2. The 
particle starts from 0 and dies when it attains a — l$0or》= a + rg — 1. 
Find the probability P n that after n displacements the particle is still alive, as 
follows. 

Set g(k) = 1/2 for 是 = 士 1 and g(k) = 0 otherwise. Then P n = S 《 ( 是 l) • • • 

h 

g(k n ) where the sum is taken over all 々 ’s such that a ^ ^ b y h — \ y 7 y 

• 卜 i 

Set dj = k\ -\ - h kj — a. Then P n is the sum of the elements of 

the (1 — a)-th column or row of the matrix A n where 


^ = (g(j - h)) 


I 0 I 0 
0 * 0 4 


The proper values 入 / of J are given by X/ = cos — —, the proper values of A n 
are 入 ,. n ，and 

n 2 容 ， .irj . 7T；(1 - a) _ irj 
k = 7 + 2 ^ cos 7+i sln T+2- cot 7+2 , 

where $\sum'$ denotes summation over the odd $j$’s only.

IV. Geometric probabilities.

Elementary probabilities. Consider an $n$-dimensional space of points $U = (u_1, \ldots, u_n)$ and let $G$ be a group of transformations of points into points. If there exists a differential element $d\mu = g(u_1, \ldots, u_n)\, du_1 \cdots du_n$ determined up to a constant factor by the property that its integral over domains is invariant with respect to the group $G$, $d\mu$ defines up to a constant factor an elementary probability. The constant factor is determined by fixing a domain $D_0$ within which all considered domains lie and by assigning to this domain the pr. one; that is, by setting $c \int_{D_0} d\mu = 1$. Then the points are said to be taken or thrown at random in $D_0$. To say that several points are taken or thrown at random means that the throws are stochastically independent; in other words, we make repeated trials.

Let $M$ with or without affixes be points in an $m$-dimensional euclidean space and let $x_1, \ldots, x_m$ with same affixes, if any, be its cartesian coordinates with respect to a fixed orthogonal frame of reference. The group $G$ which transforms points $M$ into points $M$ is the group of euclidean displacements (preserves euclidean lengths). This means that the probability is required to be independent of the choice of the frame of reference. Prove that $d\mu = c\, dx_1\, dx_2 \cdots dx_m$.

Let us now investigate straight lines in a euclidean plane determined by their equations $u_1 x_1 + u_2 x_2 = 1$ in rectangular coordinates, and let $G_e$ be the group of euclidean displacements in the plane. Prove that $d\mu = c\,(u_1^2 + u_2^2)^{-3/2}\, du_1\, du_2$ or, using the normal equations: $x_1 \cos\theta + x_2 \sin\theta - p = 0$, $d\mu = c\, dp\, d\theta$.

(The transformations of the group G e are of the form x\ : =a\ + x\ cos a 
一尤 2 sin a, x f 2 = + X\ sin a + ^2 cos a and induce transformations of a 

group G on the plane («i, « 2 ) defined by 

U\ == («’i cos a + s\n ot)/{a\u\ + 奶 《’2 + 1 )， 

«2 = (一 《’i sin a + u\ cosa)/(aiu\ + 勿 《’2 + 1). 

The invariance condition yields 

丄 ,. •,_ 、 D («l ，《 2) ... . D(U U U 2 ) («1 2 + «2 2 )^ 


gWl, «'2) = g(Ul, « 2 ) 


D(u\, U\) 


D(u\, u\) W + u'^ 


With the same group $G_e$ there is no elementary probability for circles in the plane. But there is one for circles of fixed radius.)

Points on a line. The elementary probability for a point $M$ on a segment $[0, l]$ is $dx/l$. Throw $n$ points at random on the segment. The probability, say, that there be no thrown points on $[0, x]$ is $\left(1 - \dfrac{x}{l}\right)^n$. What is the expected distance of the nearest to 0 of the thrown points? What is the probability that $k$ out of the $n$ thrown points lie on a fixed subinterval of length $a$? Find what happens as $l \to \infty$ with $n/l \to \lambda > 0$. Denote then by $M_1, M_2, \ldots$ the points in the nondecreasing order of their distance to 0. What is the elementary probability for the length to be between $x$ and $x + dx$ and what is the expectation of this length?
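One way to approach the first question is to integrate the tail probability, since the expected distance of the nearest point equals $\int_0^l (1 - x/l)^n\, dx = l/(n+1)$. The sketch below, with hypothetical values of $l$, $n$, and $\lambda$, checks this by a midpoint rule and also illustrates the limit $e^{-\lambda x}$ as $l \to \infty$ with $n/l \to \lambda$.

```python
from math import exp

def no_point_probability(x, l, n):
    """Probability that none of n independent uniform points on [0, l] falls in [0, x]."""
    return (1.0 - x / l) ** n

def expected_nearest_distance(l, n, steps=100000):
    """Midpoint-rule value of the integral of the tail probability over [0, l]."""
    h = l / steps
    return sum(no_point_probability((i + 0.5) * h, l, n) for i in range(steps)) * h

nearest = expected_nearest_distance(10.0, 4)            # compare with l / (n + 1) = 2
lam, x = 2.0, 1.5
large_l = 1.0e6
limit_value = no_point_probability(x, large_l, int(lam * large_l))   # near exp(-lam * x)
```

In the limit the tail is exponential, so the distance to the nearest point behaves like the first point of a uniform scattering with density $\lambda$.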

Lines in a plane. The elementary probability of a straight line $x \cos\theta + y \sin\theta - p = 0$ thrown on a plane is $d\mu = c\, dp\, d\theta$. The integral $\iint dp\, d\theta$ over a domain induced by a family of straight lines is said to be the measure of the family. The measure of the secants of a segment of length $l$ is $2l$. ($\theta$ varies from $-\frac{\pi}{2}$ to $+\frac{\pi}{2}$ while $p$ varies from 0 to $l \cos\theta$ for every fixed $\theta$.) The measure of the secants of a polygonal line of length $l$ is $2l$, provided every secant is counted as many times as there are points of intersections of the secant with the polygonal line; in particular, the measure of the secants of a closed convex polygon is its perimeter. The same is true for the secants of a curve formed by a finite number of analytic arcs. Prove it directly for the secants of a circle.

Let $C$ and $C_0$ be two closed convex curves of respective lengths $l$ and $l_0$ with $C$ being interior to $C_0$. The probability that a secant of $C_0$ be secant of $C$ is $l/l_0$.

Application to the needle problem. If $C_0$ is a circumference of radius $r/2$
and $C$ is a segment of length $l$, then $p = 2l/\pi r$. Throw the figure
formed by the circumference and the segment on a plane with parallel
equidistant straight lines with common distance $r$. The probability that one
of these lines intersects the segment is $2l/\pi r$. Prove it directly by
throwing a needle of length $l$ on this plane.

(The position of the needle $AB$ is determined by the coordinates $x$, $y$ of
$A$ and the angle $\alpha$ that $AB$ makes with $Ox$, one of the equidistant
lines. The elementary probability is $dx\,dy\,d\alpha$. It is not a
restriction to assume $\alpha$ between 0 and $\pi/2$, $x = 0$, and $y$ between
0 and $r$. Then

$$p = \frac{2l}{\pi r} \int_0^{\pi/2} \sin\alpha \, d\alpha.)$$
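The needle computation above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original text: the function names, the seed, and the number of trials are illustrative assumptions, and the needle is assumed to be no longer than the distance between lines ($l \le r$). It draws $y$ uniformly on $[0, r]$ and $\alpha$ uniformly on $[0, \pi/2]$, as in the parenthetical remark, and counts the throws in which the end $B$ reaches the next line.

```python
import math
import random


def needle_probability(l, r):
    """Theoretical value 2l / (pi r) for a needle of length l <= r."""
    return 2.0 * l / (math.pi * r)


def needle_hit_frequency(l, r, trials, seed=0):
    """Fraction of simulated throws in which the needle meets a line.

    A is placed at height y between two neighboring lines (ordinates 0 and r),
    and AB makes the angle alpha with the lines; B then has ordinate
    y + l*sin(alpha), so the needle meets a line when that ordinate reaches r.
    """
    rng = random.Random(seed)  # fixed seed: an assumption, for reproducibility
    hits = 0
    for _ in range(trials):
        y = rng.uniform(0.0, r)
        alpha = rng.uniform(0.0, math.pi / 2.0)
        if y + l * math.sin(alpha) >= r:
            hits += 1
    return hits / trials
```

With a large number of trials the observed frequency should lie close to `needle_probability(l, r)`; the agreement is only statistical, so any comparison needs a tolerance.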

A differential method. Let $D_0$ be a domain of the plane on which are thrown
at random n points. Intrinsic properties of the figure formed by the points are
defined independently of $D_0$; for example, $M_1M_2 < l$, triangle
$M_1M_2M_3$ has acute angles, ....

The probability of an intrinsic property is given by $P = a/s^n$ where $s$ is
the area of $D_0$ and $a$ represents the measure of the set of favorable cases.
Let $D_0'$ be a new domain containing $D_0$ and let
$P + \Delta P = (a + \Delta a)/(s + \Delta s)^n$ be the new probability of the
same property. If $P_k$ is the probability of the property when $n - k$ points
are in $D_0$ and $k$ points are in $D_0' - D_0$, then

$$a + \Delta a = a + a_1 + \cdots + a_n, \quad a_k = \frac{n!}{k!(n-k)!} P_k s^{n-k} (\Delta s)^k$$

and

$$(s + \Delta s)^n \Delta P = n(P_1 - P)s^{n-1}\Delta s + \cdots + \frac{n!}{k!(n-k)!}(P_k - P)s^{n-k}(\Delta s)^k + \cdots + (P_n - P)(\Delta s)^n.$$

Keeping infinitesimals of first order, we have

$$\delta P = n(P_1 - P)\frac{\delta s}{s}$$

where $n$ is the number of points thrown at random on $D_0$, $P$ is the
probability of the property, $P_1$ is the probability of the same property when
1 point is thrown at random on an increment of $D_0$ of area $\delta s$, and
$n - 1$ points are thrown at random on $D_0$. More generally,

$$\delta m = n(m_1 - m)\frac{\delta s}{s}$$

where $m$ is the expectation of a function of the thrown points, and the other
quantities are defined similarly to what precedes. The method and the formulae
apply whatever be the number of dimensions of the space.

Application. Two points $M_1$ and $M_2$ are thrown at random on a segment of
length $l$. The probability that $M_1M_2 < x$ is

$$\frac{2x}{l} - \frac{x^2}{l^2}.$$

What happens when the segment is replaced by a circle of radius $r$? Find
$E\,M_1M_2$ in both cases.


V. Bernoulli case and Weierstrass theorem. Consider the Bernoulli case
(with $PA = x$ in lieu of $p$): $0 \le x \le 1$,

$$P(S_n = k) = p_{nk}(x) = \frac{n!}{k!(n-k)!} x^k (1-x)^{n-k}, \quad k = 0, 1, \ldots, n.$$

(a)

$$\sum_{k=0}^{n} p_{nk}(x) = 1, \quad ES_n = \sum_{k=0}^{n} k\,p_{nk}(x) = nx,$$

$$\sigma^2 S_n = \sum_{k=0}^{n} (k - nx)^2 p_{nk}(x) = nx(1 - x).$$

(b) Let $f$ be a real or complex-valued continuous function on $[0, 1]$. It is
bounded: $|f| \le c < \infty$ and uniformly continuous: Given $\epsilon > 0$
there is a $\delta > 0$ such that $|x - x'| < \delta$ implies
$|f(x) - f(x')| < \epsilon$. Form Bernstein polynomials

$$P_n(x) = E(f(S_n/n)) = \sum_{k=0}^{n} f(k/n)\,p_{nk}(x),$$

that is,

$$P_n(x) = \sum_{k=0}^{n} f(k/n) \frac{n!}{k!(n-k)!} x^k (1-x)^{n-k}.$$

(c) Weierstrass theorem says that on $[0, 1]$ there are polynomials which
converge uniformly to $f$.

Bernstein polynomials are such that

$$|E(f(x) - f(S_n/n))| = |f(x) - P_n(x)| = \Big|\sum_{k=0}^{n} (f(x) - f(k/n))\,p_{nk}(x)\Big| \le \Big|\sum_{|k-nx| \le n\delta}\Big| + \Big|\sum_{|k-nx| > n\delta}\Big|.$$

The first partial sum is bounded by
$\epsilon \sum_{k=0}^{n} p_{nk}(x) = \epsilon$. The second partial sum is
bounded by

$$2c \sum_{|k-nx| > n\delta} p_{nk}(x) \le \frac{2c}{n^2\delta^2} \sum_{k=0}^{n} (k - nx)^2 p_{nk}(x) = \frac{2c\,x(1-x)}{n\delta^2} \le \frac{c}{2n\delta^2};$$

note that the first inequality while algebraically immediate is due to
Tchebichev's inequality:

$$P(|S_n/n - E(S_n/n)| > \delta) \le \sigma^2(S_n/n)/\delta^2.$$

Thus, for all $x \in [0, 1]$, as $n \to \infty$ then $\epsilon \to 0$,

$$|f(x) - P_n(x)| \le \epsilon + \frac{c}{2n\delta^2} \to 0.$$

Leaving out all references to the Bernoulli case, the most elementary proof
known of Weierstrass theorem obtains: It introduces explicit uniformly
approximating polynomials and is primarily algebraic.
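The construction in (b) and (c) can be tried out directly. The sketch below is an illustration only, with hypothetical function names and an arbitrary evaluation grid; it evaluates $P_n(x) = \sum_k f(k/n)\,p_{nk}(x)$ and measures the largest deviation from $f$ on a finite grid, which is only a proxy for the uniform distance on $[0, 1]$.

```python
from math import comb


def bernstein(f, n, x):
    """Bernstein polynomial P_n(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k)."""
    return sum(
        f(k / n) * comb(n, k) * x**k * (1.0 - x) ** (n - k)
        for k in range(n + 1)
    )


def max_grid_error(f, n, grid_points=201):
    """Largest |f(x) - P_n(x)| over an equally spaced grid on [0, 1].

    The grid size is an assumption; a finite grid can only approximate
    the supremum over the whole interval.
    """
    xs = [i / (grid_points - 1) for i in range(grid_points)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in xs)
```

Taking $f(t) = 1$, $f(t) = t$, and $f(t) = t^2$ recovers the three identities of (a) in the form $P_n = 1$, $P_n(x) = x$, and $P_n(x) = x^2 + x(1-x)/n$, which is a convenient sanity check on the code.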








Part One 


NOTIONS OF MEASURE THEORY 


No rigorous presentation of probability theory is possible without
using the notions of sets, measures, measurable functions, and integrals.
Their first lineaments are already apparent in elementary probability
theory. These notions are introduced and investigated systematically in
this part.

The presentation is self-contained, and the material will suffice for
later parts. It is organized, at the cost of a few repetitions, so as to
make the unstarred portions independent of the starred ones and, at
the same time, to make the sections on measurable functions, convergence,
and integration independent of the remainder except for 1.1 to 1.5.
This permits a reorganization of the course so as to proceed from the less
abstract notions toward more abstract and more involved ones. The
following order is possible: 1.1 to 1.5 with 5.1 to 7.2, then 3.1, 3.2 with
8.1, suffice for practically all of the unstarred portions of Parts II, III,
then IV.


SETS, SPACES, AND MEASURES


§ 1. SETS, CLASSES, AND FUNCTIONS

1.1 Definitions and notations. A set is a collection of arbitrary elements.
By an abuse of language, an empty set is a "set with no elements."

Unless otherwise stated, all sets will be sets of elements of a fixed
nonempty set $\Omega$, to be called a space. Elements of $\Omega$ will be
called points and denoted by $\omega$, with or without affixes (such as
subscripts, superscripts, primes, etc.). Capitals $A, B, C, \ldots$, with or
without affixes, will denote sets of points, $\{\omega\}$ will denote a set
consisting of the one point $\omega$, and $\emptyset$ will denote the empty
set, that is, the set "containing no points." If $\omega$ is a point of $A$,
we write $\omega \in A$ and, if $\omega$ is not a point of $A$, we write
$\omega \notin A$.

A set of sets is called a class and classes will be denoted by
$\mathcal{A}, \mathcal{B}, \mathcal{C}, \ldots$, with or without affixes. The
class of all the sets in $\Omega$ is called the space of sets in $\Omega$ and
will be denoted by $\mathcal{S}(\Omega)$. Thus a class of sets in $\Omega$ is
a set in $\mathcal{S}(\Omega)$ and all set notions and operations apply to
classes considered as sets in the corresponding space of sets.

$A$ is said to be a subset of $B$, or included in $B$, or contained in $B$, if
all points of $A$ are points of $B$; we then write $A \subset B$ or,
equivalently, $B \supset A$. In symbols, if $\omega \in A$ implies
$\omega \in B$, then $A \subset B$, and conversely. Clearly, for every set $A$,

$$\emptyset \subset A \subset \Omega,$$

and the relation of inclusion is reflexive and transitive:

$$A \subset A; \quad A \subset B \text{ and } B \subset C \text{ imply } A \subset C.$$

$A$ and $B$ are said to be equal if $A \subset B$ and $B \subset A$; we then
write $A = B$. Clearly, the relation of equality is reflexive, transitive, and
symmetric:

$$A = A; \quad A = B \text{ and } B = C \text{ imply } A = C; \quad A = B \text{ implies } B = A.$$

1.2 Differences, unions, and intersections. The difference $A - B$ is
the set of all points of $A$ which do not belong to $B$; in symbols, if
$\omega \in A$ and $\omega \notin B$, then $\omega \in A - B$, and conversely.
The particular difference $\Omega - A$, that is, the set of all points which
do not belong to $A$, is called the complement of $A$ and is denoted by $A^c$.

The intersection $A \cap B$, or simply $AB$, is the set of all points common
to $A$ and $B$; in symbols, if $\omega \in A$ and $\omega \in B$, then
$\omega \in AB$ and conversely. The union $A \cup B$ is the set of all points
which belong to at least one of the sets $A$ or $B$; in symbols, if
$\omega \in A$ or $\omega \in B$, then $\omega \in A \cup B$ and conversely. If
$AB = \emptyset$, then $A$ and $B$ are said to be disjoint, and their union is
then denoted by $A + B$ and called a sum.

It follows from the definitions that the operations of intersection and
union are associative, commutative, and distributive:

$$(A \cup B) \cup C = A \cup (B \cup C), \quad (AB)C = A(BC);$$

$$A \cup B = B \cup A, \quad AB = BA;$$

$$(A \cup B)C = AC \cup BC, \quad (A \cup B)(A \cup C) = A \cup BC.$$

Moreover, the operation of complementation has the following properties:

$$A \subset B \text{ implies } A^c \supset B^c;$$

$$\Omega^c = \emptyset, \quad \emptyset^c = \Omega, \quad AA^c = \emptyset, \quad A + A^c = \Omega, \quad (A^c)^c = A;$$

$$A - B = AB^c, \quad (A \cup B)^c = A^cB^c, \quad (AB)^c = A^c \cup B^c.$$

The notions of intersection and union extend at once to arbitrary
classes. Let $T$ be a set, not necessarily in $\Omega$, and to every $t \in T$
assign a set $A_t \subset \Omega$. The class $\{A_t, t \in T\}$ of all these
sets, or simply $\{A_t\}$ if there is no confusion possible, is a class
assigned to the index set $T$.

The intersection, or infimum, of all sets of $\{A_t\}$ is defined to be the
set of all those points which belong to every $A_t$, and is denoted by
$\bigcap_{t \in T} A_t$ or by $\inf_{t \in T} A_t$; we drop $t \in T$ if there
is no confusion possible. In symbols, if $\omega \in A_t$ for every
$t \in T$, then $\omega \in \bigcap A_t$ and conversely.

The union, or supremum, of all sets of the class $\{A_t\}$ is defined to be
the set of all those points which belong to at least one $A_t$, and is denoted
by $\bigcup_{t \in T} A_t$ or by $\sup_{t \in T} A_t$; we drop $t \in T$ if
there is no confusion possible. In symbols, if $\omega \in A_t$ for at least
one $t \in T$, then $\omega \in \bigcup A_t$ and conversely.

If all sets of $\{A_t\}$ are pairwise disjoint, $\{A_t\}$ is said to be a
disjoint class and the union of its sets, denoted then by $\sum A_t$, is
called a sum. Conversely, the term "sum" and the symbols $\sum$ and $+$ when
used for sets of a class will imply that the class is disjoint.

If $\omega$ belongs to none of the $A_t$ (that is, does not belong to their
union), then it belongs to every $A_t^c$, and conversely; consequently
(de Morgan rule),

$$\Big(\bigcup A_t\Big)^c = \bigcap A_t^c, \quad \Big(\bigcap A_t\Big)^c = \bigcup A_t^c.$$
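On a finite space the de Morgan rule can be verified mechanically. The following is a small sketch with assumed names (`OMEGA`, `union`, `intersection`, `de_morgan_holds` are illustrative and not from the text). The starting values of the two folds are chosen so that the union of an empty class is the empty set and its intersection is the whole space, which is the convention under which the rule stays valid for an empty index set.

```python
from functools import reduce

OMEGA = frozenset(range(8))  # a small illustrative space (an assumption)


def complement(A):
    """Complement of A relative to OMEGA."""
    return OMEGA - A


def union(sets):
    """Union of a class of sets; the empty class gives the empty set."""
    return reduce(lambda acc, A: acc | A, sets, frozenset())


def intersection(sets):
    """Intersection of a class of sets; the empty class gives OMEGA."""
    return reduce(lambda acc, A: acc & A, sets, OMEGA)


def de_morgan_holds(sets):
    """Check both de Morgan identities for the given class of subsets."""
    sets = list(sets)
    complements = [complement(A) for A in sets]
    return (
        complement(union(sets)) == intersection(complements)
        and complement(intersection(sets)) == union(complements)
    )
```

Such a check is of course no proof; it only confirms the identities on the particular classes supplied.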

When $\{A_t\}$ is empty, that is, $T$ is empty, it is natural to make the
convention that $\bigcup_{t \in \emptyset} A_t = \emptyset$. Then, in order to
preserve the foregoing relations, we have to make the convention that
$\bigcap_{t \in \emptyset} A_t = \Omega$. Thus, by convention,

$$\bigcup_{t \in \emptyset} A_t = \emptyset, \quad \bigcap_{t \in \emptyset} A_t = \Omega.$$

It is easily seen, collecting all the relations so far obtained, that the
following duality rule holds:

Every valid relation between sets, obtained by taking complements, unions,
and intersections, is transformed into a valid relation if, the symbols
"$=$" and "$c$" remaining unchanged, the symbols $\cap$, $\subset$, and
$\emptyset$ are interchanged with the symbols $\cup$, $\supset$, and $\Omega$,
respectively.

Operations performed on elements of "countable" classes will play
a prominent role later in connection with the notion of measure. A
set, or a class, is said to be finite, or denumerable, according as its
elements can be put in a one-to-one correspondence with the set
$\{1, 2, \ldots, n\}$ of the first $n$ positive integers, for some value of
$n$, or with the set of all positive integers $\{1, 2, \ldots$ ad
infinitum$\}$. It is said to be countable if it is either finite or
denumerable. Similarly, operations performed on elements of finite,
denumerable, or countable classes will be said to be finite, denumerable, or
countable operations, respectively.

The following immediate transformation of countable unions into
countable sums will prove useful in connection with the notion of
measure:

$$\bigcup A_j = A_1 + A_1^c A_2 + A_1^c A_2^c A_3 + \cdots.$$
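The transformation of a union into a sum is easy to carry out for a finite list of sets. The sketch below uses a hypothetical helper name, `disjoint_parts`, and returns the terms $A_1, A_1^cA_2, A_1^cA_2^cA_3, \ldots$ in order; each term keeps only the points of $A_j$ not already covered by the earlier sets.

```python
def disjoint_parts(sets):
    """Return the disjoint terms A_1, A_1^c A_2, A_1^c A_2^c A_3, ...

    The j-th term consists of the points of A_j that belong to none of
    A_1, ..., A_{j-1}; the terms are pairwise disjoint and have the same
    union as the original sets.
    """
    covered = set()
    parts = []
    for A in sets:
        A = set(A)
        parts.append(A - covered)
        covered |= A
    return parts
```

For instance, the sets `{1, 2}`, `{2, 3}`, `{1, 3, 4}` yield the terms `{1, 2}`, `{3}`, `{4}`.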

1.3 Sequences and limits. To every value of $n = 1, 2, \ldots$ assign
a set $A_n$; these sets $A_n$, whether distinct or not, are distinguished by
their indices. The ordered denumerable class $A_1, A_2, \ldots$, is called
sequence $A_n$. The set of all those points which belong to almost all
$A_n$ (all but any finite number) is called the inferior limit of $A_n$ and is
denoted by $\liminf A_n$. Clearly

$$\liminf A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k.$$

The set of all those points which belong to infinitely many $A_n$ is called
the superior limit of $A_n$ and is denoted by $\limsup A_n$. Since every
point which belongs to almost all $A_n^c$ belongs to a finite number of $A_n$
only, and conversely, it follows, by duality, that

$$\limsup A_n = \Big(\bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k^c\Big)^c = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k.$$

Every point which belongs to almost all $A_n$ belongs to infinitely many
$A_n$, so that

$$\liminf A_n \subset \limsup A_n.$$

Thus, if the reverse inclusion is true, $\liminf A_n$ and $\limsup A_n$ are
equal to the same set $A$. Then $A$ is called the limit of $A_n$ and is
denoted by $\lim A_n$; the sequence $A_n$ is said to converge to $A$ and we
write $A_n \to A$. Clearly, limits (inferior or superior) of sequences of sets
are formed by denumerable set operations.
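The two formulas can be evaluated exactly for sequences that are eventually periodic, since such a sequence has only finitely many distinct tails. The sketch below rests on that assumption: the sequence is described by a finite list `prefix` followed by a nonempty list `cycle` repeated forever, and both names, like the function names, are illustrative choices rather than notation from the text.

```python
def _tails(prefix, cycle):
    """Yield, for n = 0, ..., len(prefix), the sets occurring in {A_k, k >= n}.

    Later tails start inside the cycle but still contain every set of the
    cycle, so their intersection and union equal those of the last tail
    yielded here.
    """
    prefix = [set(A) for A in prefix]
    cycle = [set(A) for A in cycle]
    for n in range(len(prefix) + 1):
        yield prefix[n:] + cycle


def lim_inf(prefix, cycle):
    """Union over n of the intersections of the tails {A_k, k >= n}."""
    result = set()
    for tail in _tails(prefix, cycle):
        result |= set.intersection(*tail)
    return result


def lim_sup(prefix, cycle):
    """Intersection over n of the unions of the tails {A_k, k >= n}."""
    tail_unions = [set.union(*tail) for tail in _tails(prefix, cycle)]
    return set.intersection(*tail_unions)
```

When the cycle consists of a single set, the sequence is eventually constant and the two limits coincide with that set, in agreement with the definition of convergence above.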

Monotone sequences form a basic class of convergent sequences. A
sequence $A_n$ is said to be monotone if it is either nondecreasing:
$A_1 \subset A_2 \subset \cdots$, and we then write $A_n \uparrow$; or if it is
nonincreasing: $A_1 \supset A_2 \supset \cdots$, and we then write
$A_n \downarrow$. From the expressions above of inferior and superior limits,
it follows at once that

every monotone sequence is convergent, and $\lim A_n = \bigcup A_n$ or
$\bigcap A_n$ according as $A_n \uparrow$ or $A_n \downarrow$.

Moreover, if we consider this proposition as a definition of limits of
monotone sequences then, since for an arbitrary sequence $B_n$,

$$\bigcap_{k=n}^{\infty} B_k = \inf_{k \ge n} B_k \uparrow \quad \text{and} \quad \bigcup_{k=n}^{\infty} B_k = \sup_{k \ge n} B_k \downarrow,$$

it follows that its inferior and superior limits can be defined by

$$\liminf B_n = \lim_n \Big(\inf_{k \ge n} B_k\Big) \quad \text{and} \quad \limsup B_n = \lim_n \Big(\sup_{k \ge n} B_k\Big).$$

1.4 Indicators of sets. Set operations can be replaced by equivalent
but more familiar ones, in the following manner. To every set $A$ assign
a function $I_A$ of $\omega$, to be called the indicator of $A$, defined by

$$I_A(\omega) = 1 \text{ or } 0 \text{ according as } \omega \in A \text{ or } \omega \notin A.$$

Conversely, every function of $\omega$ which can take only the values 0 and 1
is the indicator of the set for the points of which it takes the value 1.
The one-to-one correspondences (denoted by $\Leftrightarrow$) and relations
listed below are immediate.

$$I_A \le I_B \Leftrightarrow A \subset B, \quad I_A = I_B \Leftrightarrow A = B, \quad I_{AB} = 0 \Leftrightarrow AB = \emptyset,$$

$$I_{\emptyset} = 0, \quad I_{\Omega} = 1, \quad I_A + I_{A^c} = 1,$$

$$I_{\inf A_t} = \inf I_{A_t}, \quad I_{\sup A_t} = \sup I_{A_t},$$

$$I_{\bigcap A_n} = \prod I_{A_n}, \quad I_{\sum A_n} = \sum I_{A_n},$$

$$I_{\bigcup A_n} = I_{A_1} + (1 - I_{A_1})I_{A_2} + (1 - I_{A_1})(1 - I_{A_2})I_{A_3} + \cdots,$$

$$I_{\liminf A_n} = \liminf I_{A_n}, \quad I_{\limsup A_n} = \limsup I_{A_n}, \quad I_{\lim A_n} = \lim I_{A_n}.$$
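A few of these relations can be spot-checked on a finite space. The sketch below is illustrative, with hypothetical names `indicator` and `union_indicator`; the second function builds the indicator of a finite union from the expansion $I_{A_1} + (1 - I_{A_1})I_{A_2} + \cdots$ rather than from the union itself, so that the two can be compared.

```python
def indicator(A):
    """Return the function omega -> 1 if omega is in A, else 0."""
    A = frozenset(A)
    return lambda w: 1 if w in A else 0


def union_indicator(sets):
    """Indicator of a finite union, built from the expansion
    I_{A_1} + (1 - I_{A_1}) I_{A_2} + (1 - I_{A_1})(1 - I_{A_2}) I_{A_3} + ...
    """
    inds = [indicator(A) for A in sets]

    def I(w):
        total = 0
        none_so_far = 1  # product of (1 - I_{A_j}(w)) over the earlier sets
        for ind in inds:
            total += none_so_far * ind(w)
            none_so_far *= 1 - ind(w)
        return total

    return I
```

The value returned by `union_indicator` is always 0 or 1, because at most one term of the expansion is nonzero at any given point.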

1.5 Fields and $\sigma$-fields. Classes of sets in $\Omega$ are sets in the
space $\mathcal{S}(\Omega)$ of all sets in $\Omega$ and thus what precedes
applies to classes. However, there is a notion specific to classes: that of
closure under one or more set operations. A class $\mathcal{C}$ is said to be
closed under a set operation if the sets obtained by performing this operation
on sets of $\mathcal{C}$ are sets of $\mathcal{C}$. In particular, the class
$\mathcal{S}(\Omega)$ of all sets in $\Omega$ is closed under every set
operation.

In connection with the notions of measurability and of measure, two
species of classes play a prominent role: fields and $\sigma$-fields. A field
is a (nonempty) class closed under all finite set operations; clearly, every
field contains $\emptyset$ and $\Omega$. A $\sigma$-field is a (nonempty) class
closed under all countable set operations; clearly every $\sigma$-field is a
field. We observe that, because of the duality rule, closure under
complementations and finite (countable) intersections implies closure under
finite (countable) unions. Also we can interchange in this property
"intersections" and "unions."

Let S-classes be species of classes closed under set operations S; for
example, the species of fields or the species of $\sigma$-fields. We observe
that $\mathcal{S}(\Omega)$ is an S-class, whatever be the set operations S.

a. Arbitrary intersections of S-classes are S-classes. In particular,
arbitrary intersections of fields or of $\sigma$-fields are fields or
$\sigma$-fields, respectively.

For the intersection of a collection of S-classes is contained in every one of
these classes. Therefore, performing operations S on sets of the
intersection, we obtain sets belonging to every one of these classes, that is,
to the intersection.

This property gives rise to the notion of a "minimal" S-class over a
given class. An S-class $\mathcal{C}'$ containing $\mathcal{C}$ is a minimal
class over $\mathcal{C}$ or the S-class generated by $\mathcal{C}$ if every
S-class containing $\mathcal{C}$ contains $\mathcal{C}'$.

b. There is one, and only one, minimal S-class over a class $\mathcal{C}$. In
particular, there is one, and only one, minimal field and one, and only one,
minimal $\sigma$-field over $\mathcal{C}$.

For the intersection of all S-classes containing $\mathcal{C}$ (there is at
least one, namely $\mathcal{S}(\Omega)$) contains $\mathcal{C}$ and is
contained in every S-class containing $\mathcal{C}$.
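When the space is finite, the minimal field over a class can be computed by brute force, and it is then also the minimal $\sigma$-field since every countable operation reduces to a finite one. The sketch below is an illustration under that finiteness assumption; `generated_field` is a hypothetical name, and the procedure builds the field from below by repeated closure rather than as the intersection of all fields containing the class, the route taken in the argument above.

```python
def generated_field(omega, generators):
    """Smallest class containing the generators and closed under
    complementation and finite unions, on a finite space omega.

    Closure under intersections then follows from the de Morgan rule.
    """
    omega = frozenset(omega)
    field = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(field)
        for A in current:
            candidates = [omega - A] + [A | B for B in current]
            for C in candidates:
                if C not in field:
                    field.add(C)
                    changed = True
    return field
```

For the space `{1, 2, 3, 4}` and the generating class `[{1}, {1, 2}]`, the result consists of all unions of the three "atoms" `{1}`, `{2}`, `{3, 4}`.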

A space $\Omega$ in which is selected a fixed $\sigma$-field $\mathcal{A}$ is
called a measurable space $(\Omega, \mathcal{A})$. If there is no confusion
possible, the sets of $\mathcal{A}$ are said to be measurable.

1.6 Monotone classes. We shall need the notion of monotone
classes in connection with the problem of extending measures on a
field to its minimal $\sigma$-field. A monotone class is a class closed under
formation of limits of monotone sequences.

a. A $\sigma$-field is a monotone field and conversely.

The first assertion is obvious and the second follows from the fact that
every countable intersection $\bigcap_n A_n$ and union $\bigcup_n A_n$ is a
monotone limit of sequences $\bigcap_{k=1}^{n} A_k$ and
$\bigcup_{k=1}^{n} A_k$ of finite intersections and unions.

The property we shall require is as follows:

A. The minimal monotone class $\mathcal{M}$ and the minimal $\sigma$-field
$\mathcal{A}$ over the same field $\mathcal{C}$ coincide.

Proof. On account of a and minimality of $\mathcal{M}$ and $\mathcal{A}$, it
suffices to prove that $\mathcal{M}$ is a field; for, a monotone field
$\mathcal{M}$ is a $\sigma$-field so that $\mathcal{M} \supset \mathcal{A}$,
and the $\sigma$-field $\mathcal{A}$ is monotone so that
$\mathcal{M} \subset \mathcal{A}$. Since
$\mathcal{M} \supset \mathcal{C} \ni \Omega$ and unions are reducible to
intersections (by means of complementations), it suffices to prove that, if
$A$ and $B$ belong to $\mathcal{M}$, so do $AB$, $A^cB$, and $AB^c$.

For every fixed $A \in \mathcal{M}$ let $\mathcal{M}_A$ be the class of all
$B \in \mathcal{M}$ with the asserted property. Every $\mathcal{M}_A$ is
monotone for, if the sequence $B_n \in \mathcal{M}_A$ is monotone, then
$B = \lim B_n$ belongs to $\mathcal{M}$ and so do the limits of monotone
sequences

$$AB = \lim AB_n, \quad A^cB = \lim A^cB_n, \quad AB^c = \lim AB_n^c.$$

It follows that, for every $A \in \mathcal{C}$, the class $\mathcal{M}_A$
coincides with $\mathcal{M}$. For $\mathcal{C}$ being a field, every
$B \in \mathcal{C}$ is $\in \mathcal{M}_A$ so that
$\mathcal{C} \subset \mathcal{M}_A \subset \mathcal{M}$ and, hence,
$\mathcal{M}$ being minimal over $\mathcal{C}$,
$\mathcal{M}_A = \mathcal{M}$. In fact, $\mathcal{M}_B = \mathcal{M}$ for every
$B \in \mathcal{M}$. For, the conditions imposed upon pairs $A$, $B$ being
symmetric, $B \in \mathcal{M}$ ($= \mathcal{M}_A$ for $A \in \mathcal{C}$) is
equivalent to $A \in \mathcal{M}_B$ for every $A \in \mathcal{C}$, so that
$\mathcal{C} \subset \mathcal{M}_B$ and hence, as above,
$\mathcal{M}_B = \mathcal{M}$. But this last property means that
$\mathcal{M}$ is a field, and the proof is complete.

*1.7 Product sets. We introduce now a different type of set operation
and corresponding notions, for which we shall have need later.
Let $A_1$ and $A_2$ be two arbitrary sets with elements $\omega_1$ and
$\omega_2$, respectively. By the product set $A_1 \times A_2$ we shall mean the
set of all ordered pairs $\omega = (\omega_1, \omega_2)$ where
$\omega_1 \in A_1$ and $\omega_2 \in A_2$. If $A_1, B_1, \ldots$ are sets in a
space $\Omega_1$ and $A_2, B_2, \ldots$ are sets in a space $\Omega_2$, then
$A_1 \times A_2, B_1 \times B_2, \ldots$ are sets in the product space
$\Omega_1 \times \Omega_2$, called intervals or rectangles in
$\Omega_1 \times \Omega_2$, and the properties below follow readily from the
definition:

$$(A_1 \times A_2) \cap (B_1 \times B_2) = (A_1 \cap B_1) \times (A_2 \cap B_2)$$

$$(A_1 \times A_2) - (B_1 \times B_2) = (A_1 - B_1) \times (A_2 - B_2) + (A_1 - B_1) \times (A_2 \cap B_2) + (A_1 \cap B_1) \times (A_2 - B_2)$$
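Both rectangle identities can be confirmed on small finite sets. In the sketch below, `rect` and `difference_pieces` are hypothetical helper names; the second returns the three rectangles whose sum is asserted to be the difference $(A_1 \times A_2) - (B_1 \times B_2)$.

```python
from itertools import product


def rect(A1, A2):
    """The product set A1 x A2 as a set of ordered pairs."""
    return set(product(A1, A2))


def difference_pieces(A1, A2, B1, B2):
    """The three rectangles of the decomposition of (A1 x A2) - (B1 x B2).

    A pair of A1 x A2 lies outside B1 x B2 exactly when its first coordinate
    is outside B1 or its second coordinate is outside B2, which gives three
    mutually exclusive cases.
    """
    return [
        rect(A1 - B1, A2 - B2),
        rect(A1 - B1, A2 & B2),
        rect(A1 & B1, A2 - B2),
    ]
```

The three pieces are pairwise disjoint, which is what justifies writing their union as a sum.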

In turn, it follows at once from these relations that

a. If $\mathcal{C}_1$ and $\mathcal{C}_2$ are fields of sets in $\Omega_1$ and
$\Omega_2$, respectively, then the class of all finite sums of intervals
$A_1 \times A_2$, where $A_1 \in \mathcal{C}_1$ and $A_2 \in \mathcal{C}_2$,
is a field of sets in $\Omega_1 \times \Omega_2$.

This field will be called the product field of $\mathcal{C}_1$ and
$\mathcal{C}_2$.

Yet, if $\mathcal{A}_1$ and $\mathcal{A}_2$ are $\sigma$-fields of sets in
$\Omega_1$ and $\Omega_2$, respectively, then the product field of
$\mathcal{A}_1$ and $\mathcal{A}_2$ is not necessarily a $\sigma$-field. The
minimal $\sigma$-field over it will be called the product $\sigma$-field
$\mathcal{A}_1 \times \mathcal{A}_2$. If $(\Omega_1, \mathcal{A}_1)$ and
$(\Omega_2, \mathcal{A}_2)$ are measurable spaces, then their product
measurable space is, by definition,
$(\Omega_1 \times \Omega_2, \mathcal{A}_1 \times \mathcal{A}_2)$.

Let $\Omega = \Omega_1 \times \Omega_2$ and
$\mathcal{A} = \mathcal{A}_1 \times \mathcal{A}_2$. If $A \subset \Omega$ is
measurable and $\omega_1 \in \Omega_1$ is a fixed point, then the set
$A(\omega_1)$ of all points $\omega_2 \in \Omega_2$ such that
$\omega = (\omega_1, \omega_2) \in A$ is called the section of $A$ at
$\omega_1$; similarly for the section $A(\omega_2)$ at
$\omega_2 \in \Omega_2$; by the definition, $A(\omega_1) \subset \Omega_2$ and
$A(\omega_2) \subset \Omega_1$.

b. Every section of a measurable set is measurable.

For let $\mathcal{C}$ be the class of all measurable sets in $\Omega$ whose
sections are measurable. It is easily seen that $\mathcal{C}$ is a
$\sigma$-field. On the other hand, if $A = A_1 \times A_2$ is a measurable
interval, that is, $A_1$ and $A_2$ are measurable, then every section of $A$
is either empty or is $A_1$ or $A_2$, so that $A \in \mathcal{C}$. Therefore,
$\mathcal{A} = \mathcal{A}_1 \times \mathcal{A}_2$, being the minimal
$\sigma$-field over all measurable intervals, is contained in $\mathcal{C}$,
and the assertion is proved.

The foregoing definitions and properties extend at once to any finite
number of sets and of measurable spaces. However, in the nonfinite
case, some of these definitions have to be modified in order to preserve
these properties.

Let $\{A_t, t \in T\}$ be an arbitrary collection of arbitrary sets $A_t$ in
arbitrary spaces $\Omega_t$ of points $\omega_t$. The product set
$A_T = \prod_{t \in T} A_t$ is the set of all the new elements
$\omega_T = (\omega_t, t \in T)$ such that $\omega_t \in A_t$ for every
$t \in T$. The product set $A_T$ is in the product space
$\Omega_T = \prod_{t \in T} \Omega_t$; we drop "$t \in T$" if there is no
confusion possible. It follows from the foregoing definition that, for any set
$B$, when the $\Omega_t$ are identical

$$\Big(\bigcap A_t\Big) \times B = \bigcap (A_t \times B), \quad \Big(\bigcup A_t\Big) \times B = \bigcup (A_t \times B).$$

Let $T_N = \{t_1, \ldots, t_N\}$ be a finite index subset and let $A_{T_N}$ be
a set in the product space $\Omega_{T_N}$. The set
$A_{T_N} \times \Omega_{T - T_N}$ is a cylinder in $\Omega_T$ with base
$A_{T_N}$. If the base is a product set $\prod_{t \in T_N} A_t$, the cylinder
becomes a product cylinder or an interval in $\Omega_T$ with sides $A_t$,
$t \in T_N$. Let $\mathcal{C}_t$ be fields in $\Omega_t$. It is easily seen
that, as in the finite case,

A. The class of all finite sums of all the intervals in $\Omega_T$ with sides
$A_t \in \mathcal{C}_t$ is a field of sets in $\Omega_T$.

This field is the product field of the fields $\mathcal{C}_t$.

Let $(\Omega_t, \mathcal{A}_t)$ be measurable spaces. The minimal
$\sigma$-field over the product field of the $\mathcal{A}_t$ is the product
$\sigma$-field $\mathcal{A}_T = \prod \mathcal{A}_t$ of measurable sets in
$\Omega_T$, and the measurable space $(\Omega_T, \mathcal{A}_T)$ is the product
measurable space $(\prod \Omega_t, \prod \mathcal{A}_t)$ of the measurable
spaces $(\Omega_t, \mathcal{A}_t)$. It is easily seen, as in the finite case,
that b remains valid:

B. Sections at $\omega_{T_N}$ of measurable sets in $\Omega_T$ are measurable
sets in $\Omega_{T - T_N}$.

*1.8 Functions and inverse functions. Perhaps the most important
notion of mathematics is that of function (or transformation, or mapping,
or correspondence). We have already encountered functions defined on an
index set $T$ whose "values" are sets in $\Omega$. In general, a function $X$
on a space $\Omega$ (the domain of $X$) to a space $\Omega'$ (the range space
of $X$) is defined by assigning to every point $\omega \in \Omega$ a point
$\omega' \in \Omega'$ called the value of $X$ at $\omega$ and denoted by
$X(\omega)$. Sets and classes of sets in $\Omega'$ will be denoted by
$A', B', \ldots$, and $\mathcal{A}', \mathcal{B}', \ldots$, respectively.
It will be assumed, once and for all, that functions are single-valued,
that is, to every given $\omega \in \Omega$ corresponds one, and only one,
value $X(\omega)$.

The set of values of $X$ for all $\omega \in A$ is called the image $X(A)$ of
$A$ (by $X$) and the class of images $X(A)$ for all $A \in \mathcal{C}$ is
called the image $X(\mathcal{C})$ of $\mathcal{C}$ (by $X$); in particular
$X(\Omega)$ is the range (of all values) of $X$. Thus, a function on $\Omega$
to $\Omega'$ determines a function on $\mathcal{S}(\Omega)$ to
$\mathcal{S}(\Omega')$. While this new function is of no great interest, such
is not the case for the inverse function that we shall introduce now.

By $[\omega; \cdots]$ where $\cdots$ stands for expressions and/or relations
involving functions on $\Omega$, we denote the set of points
$\omega \in \Omega$ for which these expressions are defined and/or these
relations are valid; if there is no confusion possible we drop "$\omega$;".
Thus, $[X = \omega']$, or inverse image of $\omega'$, is the set of all points
$\omega$ for which $X(\omega) = \omega'$; $[X \in A']$, or inverse image of
$A'$, is the set of all points $\omega$ for which $X(\omega) \in A'$; and the
inverse image of $\mathcal{C}'$ is the class of inverse images of all sets
$A' \in \mathcal{C}'$. We observe that the inverse image of an $\omega'$ which
does not belong to the range of $X$ is the empty set $\emptyset$ in $\Omega$.

The inverse function $X^{-1}$ of $X$ is defined by assigning to every $A'$
its inverse image $[X \in A']$. In other words, $X^{-1}$ is a function on
$\mathcal{S}(\Omega')$ to $\mathcal{S}(\Omega)$ with values
$X^{-1}(A') = [X \in A']$; if $A' = \{\omega'\}$, then we write
$X^{-1}(\omega')$ for $X^{-1}(\{\omega'\}) = [X = \omega']$. Since $X$ is
single-valued, $X^{-1}$ generates a partition of $\Omega$ into disjoint
inverse images of points $\omega' \in \Omega'$. It follows readily that

$$X^{-1}(A' - B') = X^{-1}(A') - X^{-1}(B'),$$

$$X^{-1}\Big(\bigcup A'_t\Big) = \bigcup X^{-1}(A'_t), \quad X^{-1}\Big(\bigcap A'_t\Big) = \bigcap X^{-1}(A'_t).$$

Therefore,

A. Basic property of inverse functions: Inverse functions preserve
all set and class inclusions and operations.
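The contrast between images and inverse images shows up already on a finite domain. The sketch below uses hypothetical helper names `image` and `preimage`, with the function passed as a Python callable and the domain as a finite iterable, both assumptions made for illustration. Inverse images commute with differences, unions, and intersections, whereas images need not preserve intersections.

```python
def image(X, A):
    """The image X(A): the set of values of X on the points of A."""
    return frozenset(X(w) for w in A)


def preimage(X, domain, target):
    """The inverse image [X in target]: points of the domain sent into target."""
    target = frozenset(target)
    return frozenset(w for w in domain if X(w) in target)
```

For example, with $X(\omega) = \omega \bmod 3$ on $\{0, \ldots, 11\}$, the sets $\{0, 1\}$ and $\{3, 4\}$ are disjoint but have the same image $\{0, 1\}$, so the image of their intersection is empty while the intersection of their images is not.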

It follows at once that

If $\mathcal{C}'$ is closed under a set operation so is
$X^{-1}(\mathcal{C}')$. In particular, the inverse image of a $\sigma$-field is
a $\sigma$-field, and the inverse image of the minimal $\sigma$-field over
$\mathcal{C}'$ is the minimal $\sigma$-field over $X^{-1}(\mathcal{C}')$.

Moreover,

If $\mathcal{A}$ is a $\sigma$-field so is the class of all sets whose inverse
images belong to $\mathcal{A}$.

The notion of function can be "iterated" as follows. Let $X$ be a
function on $\Omega$ to $\Omega'$ and let $X'$ be a function on $\Omega'$ to
$\Omega''$. Then, the function of function $X'X$ defined by
$(X'X)(\omega) = X'(X(\omega))$ is a function on $\Omega$ to $\Omega''$.
Clearly, its inverse function $(X'X)^{-1}$ is a function on
$\mathcal{S}(\Omega'')$ to $\mathcal{S}(\Omega)$ such that, for every set
$A'' \subset \Omega''$,

$$(X'X)^{-1}(A'') = X^{-1}(X'^{-1}(A''))$$

or, in a condensed form,

$$(X'X)^{-1} = X^{-1}X'^{-1}.$$

*1.9 Measurable spaces and functions. So far, we did not consider
particular species of functions. There are two species which play a
basic role in abstract analysis. We shall introduce them now. But
first we examine, in more detail, the class of inverse images of points
of the range space.

Let $X$ be a function on $\Omega$ to $\Omega'$. The partition of $\Omega$
formed by the inverse images $X^{-1}(\omega')$ of all points
$\omega' \in \Omega'$ is said to be induced (or determined) by $X$ and $X$ is
said to be constant ($= \omega'$) on $X^{-1}(\omega')$. Since the class of
values $X^{-1}(A')$ of $X^{-1}$ is the inverse image of the $\sigma$-field of
all sets $A'$ in $\Omega'$, it is a $\sigma$-field. If the partition induced by
$X$ is finite, or denumerable, or countable, then $X$ is said to be finitely,
or denumerably, or countably valued, respectively; in other words, $X$ is,
say, countably valued if the set of its values is countable. Setting
$A_j = [X = \omega'_j]$, we can write every countably valued function as a
countable combination of indicators:

$$X = \sum \omega'_j I_{A_j}.$$

Conversely, we make the convention that every time such a "sum" is
written, the sets $A_j$ form a partition of the domain of the function $X$.
If the $\omega'_j$ are distinct, then this partition is the one induced by the
function represented by the "sum."

Now, let $\mathcal{A}$ be a fixed $\sigma$-field in $\Omega$. $\Omega$,
together with $\mathcal{A}$, is called a measurable space
$(\Omega, \mathcal{A})$, and the sets of $\mathcal{A}$ are then said to be
measurable (although this terminology derives from the notion of measure, we
emphasize that, nowadays, the notion of measurability is independent of
that of measure). A countably valued function
$X = \sum \omega'_j I_{A_j}$, where the sets $A_j$ are measurable, is called a
countably valued measurable function or, for short, an elementary function;
if $X$ is finitely valued, then this elementary function is also called a
simple function. Clearly

the sets of the $\sigma$-field induced by an elementary function are
measurable.

We are now in a position to introduce the general notion of measurable
functions. However, there are several ways for doing so, and the
classes of measurable functions so defined are, in general, not the same.

One way of defining measurable functions is to extend a basic property
of inverse functions of elementary functions, as follows: Let
$(\Omega, \mathcal{A})$ and $(\Omega', \mathcal{A}')$ be two measurable spaces.
The inverse images by elementary functions on $\Omega$ to $\Omega'$ of
measurable sets are measurable. Extending this property, we say that a
function $X$ on $\Omega$ to $\Omega'$ is measurable if the inverse images by
$X$ of measurable sets ($\in \mathcal{A}'$) are measurable
($\in \mathcal{A}$). If, moreover, $(\Omega'', \mathcal{A}'')$ is a measurable
space and $X'$ on $\Omega'$ to $\Omega''$ is a measurable function, then $X'X$
is measurable, for

$$(X'X)^{-1}(\mathcal{A}'') = X^{-1}(X'^{-1}(\mathcal{A}'')) \subset X^{-1}(\mathcal{A}') \subset \mathcal{A}.$$

Thus, with this definition, a measurable function of a measurable function
is measurable.

Another way of defining measurable functions is as follows: Let
$(\Omega, \mathcal{A})$ be a measurable space on which are defined simple
(elementary) functions to a space $\Omega'$ (there are no measurable sets in
$\Omega'$). A notion of limit is introduced on $\Omega'$ and measurable
functions in the sense of this limit are then defined to be limits of
convergent sequences of simple (elementary) functions. This approach is
particularly suited for the introduction of integrals of measurable
functions. Later we shall see cases in which measurable sets and the notion of
limit are selected in such a manner that the two definitions are equivalent.

*§ 2. TOPOLOGICAL SPACES

The selections of measurable sets and of concepts of limit in range-spaces
are rooted in the properties of the euclidean line: real line
$R = (-\infty, +\infty)$ with euclidean distance $|x - y|$ of points
(numbers, reals) $x$, $y$. Species of spaces vary according to the preserved
amount of these properties, an amount which increases as we pass from
separated spaces to metric spaces, then to Banach spaces and to Hilbert spaces.
We examine here the basic properties of these spaces and shall encounter
them in various guises throughout the book. At the same time, the
few notions of topology which follow are a recapitulation of the properties
of the euclidean line and, more generally, of euclidean spaces.
We urge the reader to keep this fact constantly in mind by illustrating
the concepts and their relationships in terms of euclidean spaces; for
this reason, we denote here the points by $x$, $y$, $z$, with or without
affixes. Points, sets, and classes will be those of the space $\mathcal{X}$
under consideration, unless otherwise stated.

We use without comment the axiom of choice: given a nonempty class
of nonempty sets, there exists a function which assigns to every set of
the class a point belonging to this set; in other words, we can always
"choose" a point from every one of the sets of the class.

2.1 Topologies and limits. A class $\mathcal{O}$ is a topology or the class of
open sets if it is closed under formation of arbitrary unions and finite
intersections and contains $\emptyset$ and the whole space $\mathcal{X}$ (the
last property follows from the closure property by the conventions relative to
intersections and unions of sets of an empty class). The dual class of
complements of open sets is the class of closed sets; hence it is closed under
formation of arbitrary intersections and finite unions and contains
$\mathcal{X}$ and $\emptyset$.

A topological space $(\mathcal{X}, \mathcal{O})$ is a space $\mathcal{X}$ in
which is selected a topology $\mathcal{O}$; from now on, all spaces under
consideration will be topological and we shall frequently drop
"$\mathcal{O}$." A topological subspace thereof $(A, \mathcal{O}_A)$ is a set
$A$ in which is selected its induced topology $\mathcal{O}_A$ which consists
of all the intersections of open sets with $A$ and is, clearly, a topology in
$A$. It is important to distinguish the properties of $A$ considered as a
set in $(\mathcal{X}, \mathcal{O})$ from those of $A$ considered as a
topological subspace of $(\mathcal{X}, \mathcal{O})$.

To every set $A$ there are assigned an open set $A^{\circ}$ and a closed set
$\bar{A}$, as follows. The interior $A^{\circ}$ of $A$ is the maximal open set
contained in $A$, that is, the union of all open sets in $A$; in particular,
if $A$ is open, then $A^{\circ} = A$. The adherence $\bar{A}$ of $A$ is the
minimal closed set containing $A$, that is, the intersection of all closed
sets containing $A$; in particular, if $A$ is closed, then $\bar{A} = A$. The
definitions of interiors and adherences of $A$ and $A^c$ are clearly dual, so
that

$$(A^{\circ})^c = \overline{A^c}, \quad (A^c)^{\circ} = (\bar{A})^c.$$
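The duality between interiors and adherences can be checked exhaustively on a finite topological space. The three-point space and its topology below are assumptions chosen purely for illustration, as are the names `SPACE`, `OPEN_SETS`, `interior`, `adherence`, and `subsets`; the two functions follow the definitions literally, as a union of open sets and an intersection of closed sets.

```python
from itertools import combinations

SPACE = frozenset({1, 2, 3})
# An illustrative topology on SPACE: closed under unions and intersections.
OPEN_SETS = {frozenset(), frozenset({1}), frozenset({1, 2}), SPACE}


def interior(A):
    """Union of all open sets contained in A."""
    result = frozenset()
    for G in OPEN_SETS:
        if G <= A:
            result |= G
    return result


def adherence(A):
    """Intersection of all closed sets (complements of open sets) containing A."""
    result = SPACE
    for G in OPEN_SETS:
        F = SPACE - G
        if A <= F:
            result &= F
    return result


def subsets(space):
    """All subsets of a finite space, as frozensets."""
    items = sorted(space)
    return [
        frozenset(c)
        for size in range(len(items) + 1)
        for c in combinations(items, size)
    ]
```

Looping over `subsets(SPACE)` and comparing `SPACE - interior(A)` with `adherence(SPACE - A)` confirms the first duality relation for every subset of this particular space, and similarly for the second.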

In topological spaces relations between sets and points are described 
in terms of neighborhoods. Every set containing a nonempty open 
set is a neighborhood of any point x of this open set; the symbol V x 
will denote a neighborhood of x. The points of the interior A 0 of A 
are “interior” to in other words, x is interior to / if" / is a V x . The 
•points of the adherence "A oi A are adherent to A\ in other words, x is 
adherent to A if no V x is disjoint from that is, x C {^ c )° = iA) c - 

Classical analysis is concerned primarily with continuous functions
on euclidean lines to euclidean lines. In general, a function $X$ on a
topological domain $\Omega$ to a topological range space $\mathcal{X}$ is
continuous at $\omega \in \Omega$ if the inverse images of neighborhoods of
$x = X(\omega)$ are neighborhoods of $\omega$; $X$ is continuous (on $\Omega$)
if it is continuous at every $\omega \in \Omega$. Since taking inverse images
preserves all set operations, it follows readily that we can limit ourselves
to open (closed) sets. Thus $X$ is continuous if, and only if, the inverse
images of open (closed) sets are open (closed) and, hence, a continuous
function induces on its domain a topology contained in (no “finer” than) that
of the domain. Therefore, if in topological spaces the $\sigma$-fields of
measurable sets are selected to be the minimal $\sigma$-fields over the
topologies, then continuous functions are measurable. The importance of the
concept of continuity is emphasized by the fact that two spaces $\mathcal{X}$
and $\mathcal{X}'$ are considered to be “topologically equivalent” if, and
only if, there exists a one-to-one correspondence $X$ on $\mathcal{X}$ to
$\mathcal{X}'$ such that $X$ and $X^{-1}$ are continuous.

The basic concept which distinguishes classical analysis from classical
algebra and which gave rise to the various concepts examined in this
section is that of limit of sequences of numbers. In a topological space
it becomes: $x$ is limit of a sequence $x_n$, or the sequence $x_n$ converges
to $x$, if, for every $V_x$, there exists an integer $n(V_x)$ such that
$x_n \in V_x$ for all $n \geq n(V_x)$. However, the need for a more general
concept of limit is already apparent in the classical theory of integration
where the partitions of the interval of integration form a “direction” and
the Riemann sums form a “directed set” of numbers of which the Riemann
integral, if it exists, is the “limit.” It so happens that this type of limit
is precisely the one required for general topological spaces, and we now
define the foregoing terms; the role of sequences in some species of spaces
(including the euclidean ones) will be better understood when considered
within the general setup.

Let $T$ be a set of points $t$, with or without indices. $T$ is partially
ordered if a partial ordering is defined on it. A partial ordering “$\prec$,”
to be read “precedes,” is a binary relation which is transitive ($t \prec t'$
and $t' \prec t''$ imply $t \prec t''$), reflexive ($t \prec t$), and such
that, if $t \prec t'$ and $t' \prec t$, then $t = t'$; upon writing
$t' \succ t$ when $t \prec t'$, the relation “$\succ$,” to be read “follows,”
is also a partial ordering. $T$ is a direction if it is partially ordered and
if every pair $t$, $t'$ is followed by some $t''$ ($t \prec t''$,
$t' \prec t''$). $T$ is linearly ordered, and a fortiori is a direction, if
every pair $t$, $t'$ is ordered (either $t \prec t'$ or $t' \prec t$). For
example, the sets in a space are partially ordered by the relation of
inclusion and the neighborhoods of a point $x$ form a direction (this is the
root of the definition of limit as given below); the finite partitions of an
interval of integration form a direction when ordered by the relation of
refinement; integers and, in general, sets of numbers are linearly ordered by
the relation “$\leq$,” etc.

A function $X$ on $T$ to $\mathcal{X}$ can be represented by the indexed set
$\{x_t\}$ of its values, which may or may not be distinct but which are always
distinguished by their indices $t$. The indexed set $\{x_t\}$ is directed if
$T$ is a direction; sequences $\{x_n\}$ are special directed sets representing
functions on the (linearly ordered) set of positive integers. We are now
ready to define the general concept of limit.

The point $x$ is the limit of a directed set $\{x_t\}$ and we write
$x = \lim x_t$, or, equivalently, $x_t$ converges to $x$ and we write
$x_t \to x$, if, for every $V_x$, there exists an index $t(V_x)$ such that
$x_t \in V_x$ for all those indices $t$ which follow $t(V_x)$. However, the
concept of limit is of use only if, when the limit exists, it is unique; this
requirement leads to the introduction of “separated” or “Hausdorff” space as
follows:

A. Separation theorem. The following three definitions are equivalent.
A topological space is separated if

$(S_1)$ every directed set has at most one limit,

$(S_2)$ every pair of distinct points has disjoint neighborhoods,

$(S_3)$ the intersection of all closed neighborhoods of a point reduces to
this point.

The term “separated” expresses property $(S_2)$.

We observe that, according to $(S_3)$, in a separated space every set
reduced to a point is closed.

Proof. $(S_1)$ and $(S_2)$ are equivalent. Let $x \neq y$. If $x_t \to x$ and
$x_t \to y$, then $x_t \in V_x \cap V_y$ for all those $t$ which follow both
$t(V_x)$ and $t(V_y)$; since $T$ is a direction such $t$ exist, so that no
pair $V_x$, $V_y$ is disjoint.

Conversely, if no pair $V_x$, $V_y$ is disjoint, then there exist points
$z(V_x, V_y) \in V_x \cap V_y$ and, since these pairs form a direction when
ordered by the relation $(V_x, V_y) \prec (V'_x, V'_y)$ if $V_x \supset V'_x$
and $V_y \supset V'_y$, these points form a directed set converging to both
$x$ and $y$.

$(S_2)$ and $(S_3)$ are equivalent. If for every $y \neq x$ there exists a
$V_x$ such that $y \notin \bar{V}_x$, then the intersection of all $\bar{V}_x$
reduces to $x$. Conversely, if the intersection of all $\bar{V}_x$ reduces to
the set formed by $x$, then, for every $y \neq x$, there exists a $V_x$ such
that $y \notin \bar{V}_x$, and the open set $(\bar{V}_x)^c$ is a neighborhood
of $y$ disjoint from $V_x$. The proof is terminated.

From now on, all spaces will be separated spaces.

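
The notion of a direction can be made concrete with the finite partitions of
an interval ordered by refinement. The following is a minimal sketch,
assuming partitions of $[0, 1]$ stored as sorted tuples of cut points; the
helper names (`follows`, `common_refinement`, `riemann_sum`, `uniform`) are
hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def follows(p_fine, p_coarse):
    """True if p_fine refines p_coarse: every cut point of p_coarse is in p_fine."""
    return set(p_coarse) <= set(p_fine)


def common_refinement(p, q):
    """A partition that follows both p and q, as a direction requires."""
    return tuple(sorted(set(p) | set(q)))


def riemann_sum(f, partition):
    """Left-endpoint Riemann sum of f over consecutive cut points."""
    return sum(f(a) * (b - a) for a, b in zip(partition, partition[1:]))


def uniform(n):
    """Partition of [0, 1] into n equal parts, with exact rational cut points."""
    return tuple(Fraction(k, n) for k in range(n + 1))


P, Q = uniform(2), uniform(3)
R = common_refinement(P, Q)

# Refinement is only a partial ordering: neither P nor Q follows the other,
# yet the pair is followed by R, so the partitions still form a direction.
assert not follows(P, Q) and not follows(Q, P)
assert follows(R, P) and follows(R, Q)
```

Along this direction the Riemann sums of, say, $f(x) = x^2$ form a directed
set of numbers; `riemann_sum(lambda x: x * x, uniform(n))` approaches $1/3$
as `n` grows.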
2.2 Limit points and compact spaces. Analysis of concepts or properties
leads to the introduction of “weaker” ones. A property $\mathcal{P}$ is
weaker than a property $\mathcal{P}'$ if $\mathcal{P}'$ implies $\mathcal{P}$;
$\mathcal{P}$ is a necessary condition for $\mathcal{P}'$ and $\mathcal{P}'$
is a sufficient condition for $\mathcal{P}$.

Perhaps even more basic than the concept of limit is the weaker one
of limit point. A point $x$ is a limit point of the directed set $\{x_t\}$ if,
for every pair $t$, $V_x$, there exists some $t' \succ t$ such that
$x_{t'} \in V_x$. The definitions of limit and of limit point yield at once
(i) and (ii) of the proposition below, and then (iii) follows.

a. Let the sets $A_t$ be formed by all those points $x_{t'}$ for which $t'$
follows $t$:

$A_t = \{x_{t'} : t' \succ t\}.$

(i) $x_t \to x$ if, and only if, for every $V_x$ there exists an
$A_t \subset V_x$.

(ii) $x$ is a limit point of $\{x_t\}$ if, and only if, no pair $A_t$, $V_x$
is disjoint.

(iii) the set of all limit points of $\{x_t\}$ coincides with the intersection
of all $\bar{A}_t$, and if $x_t \to x$ then this set reduces to the single
point $x$.

The reason for the somewhat confusing terminology above is that
every limit point of $\{x_t\}$ is the limit of some subset of $\{x_t\}$ in the
following sense. A direction $S$ of elements $s, s', \cdots$ is a subdirection
of the direction $T$ when there exists a function $f$ on $S$ to $T$ with the
property that, for every $t$, there is an $s$ such that, if $s'$ follows $s$,
then $t' = f(s')$ follows $t$. The set $\{x_{f(s)}\}$ directed by the
subdirection $S$ of $T$ is a subdirected set. Clearly, if $x_t \to x$, then
every subdirected set $x_{f(s)} \to x$.

b. A point $x$ is a limit point of a directed set $\{x_t\}$ if, and only if,
the set contains a subdirected set which converges to $x$.

Proof. The “if” assertion follows at once from the definitions. As
for the “only if” assertion, it suffices for every pair $s' = (t, V_x)$ to
take $f(s') = t' \succ t$ such that $x_{t'} \in V_x$ and direct the pairs by
$(t_1, V_x^1) \succ (t_2, V_x^2)$ when $t_1 \succ t_2$ and
$V_x^1 \subset V_x^2$.

Compact spaces are separated spaces in which every directed set has 
at least one limit point; a set is compact if it is compact in its induced 
topology. Compactness plays a prominent role in analysis and it is 
important to have equivalent characterizations of compact spaces. We 
shall use repeatedly the following terminology: a subclass of open sets 
is an open covering of a set if every point of the set belongs to at least 
one of the sets of the subclass. 
A. Compactness theorem. The following three properties of separated
spaces are equivalent:

$(C_1)$ Bolzano-Weierstrass property: every directed set has at least
one limit point.

$(C_2)$ Heine-Borel property: every open covering of the space contains
a finite covering of the space.

$(C_3)$ Intersection property: every class of closed sets such that all its
finite subclasses have nonempty intersections has itself a nonempty
intersection.

If some class has the property described in $(C_3)$, we say that it has the
finite intersection property.

Proof. The intersection property means by contradiction that every 
class of closed sets whose intersection is empty contains a finite sub¬ 
class whose intersection is empty. Thus, it is the dual of the Heine- 
Borel property, and it suffices to show that it is equivalent to the Bol¬ 
zano-Weierstrass one. 

Let $\{x_t\}$ be a directed set and, for every $t_0 \in T$, consider the
adherence of the set of all the $x_t$ with $t$ following $t_0$. Since $T$ is a
direction, these adherences form a class of closed sets with finite
intersection property. Thus, if the intersection property is true, then there
exists an $x$ common to all these adherences and it follows that $x$ is a
limit point of $\{x_t\}$.

Conversely, consider a class of closed sets with the finite intersection 
property and adjoin all finite intersections to the class. The class so 
obtained is directed by inclusion so that, by selecting a point from every 
set of this class, we obtain a directed set. If the Bolzano-Weierstrass 
property is true, then this set has a limit point and this point belongs 
to every set of the class; hence the intersection of the class is not empty. 
This completes the proof. 

Compactness properties. 1° In a compact space, a directed set
$x_t \to x$ if, and only if, $x$ is its unique limit point.

Proof. We use a and its notations. The “only if” assertion holds
by a(iii). As for the “if” assertion, if $x_t \not\to x$ then, by a(i), there
exists a $V_x$ such that no $A_t$ is disjoint from $V_x^c$; thus, for every
$t$ we can select a $t' \succ t$ such that $x_{t'} \in A_t \cap V_x^c$. Since
the space is compact, the subdirected set $\{x_{t'}\}$, hence, by b, the
directed set $\{x_t\}$, has a limit point $x' \in V_x^c$. Therefore,
$x \neq x'$ and $x$ cannot be the unique limit point of $\{x_t\}$.
2° Every compact set is closed, and in a compact space the converse is
true.

Proof. Let $A$ be compact and let $V_x$, $V_y(x)$ be a disjoint pair of
open neighborhoods of $x \in A$ and $y \in A^c$. By the Heine-Borel property,
the open covering $\{V_x\}$ of $A$ where $x$ ranges over $A$ contains a
finite subcovering $\{V_{x_k}\}$, and the disjoint open sets
$V = \bigcup_k V_{x_k}$, $V_y = \bigcap_k V_y(x_k)$ are such that
$A \subset V$ and $y \in V_y$. Thus, the open neighborhood $V_y$ of $y$
contains no points of $A$; hence $y \notin \bar{A}$. Since $y \in A^c$ is
arbitrary, it follows that $A^c$ and $\bar{A}$ are disjoint, and the first
assertion is proved. The second assertion follows readily from the
intersection property.

3° The intersection of a nonincreasing sequence of nonempty compact 
sets is not empty. 

Apply the intersection property. 

4° The range of a continuous function on a compact domain is compact.

Proof. Because of continuity of the function, the inverse image of 
every open covering of the range is an open covering of the compact 
domain; hence it contains a finite open subcovering which is the inverse 
image of a finite open subcovering of the range. Thus, the range has 
the Heine-Borel property, and the assertion is proved. 

The euclidean line $R = (-\infty, +\infty)$ is not compact but, according to
the Bolzano-Weierstrass or Heine-Borel theorems, every closed interval
$[a, b]$ is compact. These theorems become valid for the whole line
if it is “extended,” that is, if points $-\infty$ and $+\infty$ are added.
Thus, the extended euclidean line $\bar{R} = [-\infty, +\infty]$ is compact.
In fact, $R$ is locally compact and every locally compact space can be
compactified by adding one point only, as below.

A separated space is locally compact if every point has a compact
neighborhood; it is easily shown that every neighborhood then contains
a compact one. The one-point compactification of a separated space
$(\mathcal{X}, \mathcal{O})$ is as follows. Adjoin to the points of
$\mathcal{X}$ an arbitrary point $\infty \notin \mathcal{X}$ and adjoin to the
open sets all sets obtained by adjoining to the point $\infty$ those open sets
whose complements are compact. Denote the topological space so obtained by
$(\mathcal{X}_\infty, \mathcal{O}_\infty)$.
5° The one-point compactification of a locally compact but not compact
space is a compact space, and the induced topology of the original
space is its original topology.

Proof. The last assertion follows at once from the definition of
$\mathcal{O}_\infty$. As for the first assertion, observe that the new space
is separated, since two distinct points belonging to the separated original
space are separated and the point $\infty$ is separated from any
$x \in \mathcal{X}$ by taking a compact and hence closed
$V_x \subset \mathcal{X}$, so that $\infty \in V_x^c$. Also, the new space
has the Heine-Borel property, since an open covering of it has a member
$O + \{\infty\}$ with $O^c$ compact and hence contains a finite subcovering of
$O^c$ which, together with $O + \{\infty\}$, is a finite subcovering of the
new space.

2.3 Countability and metric spaces. The euclidean line possesses
many countability properties, among them separability (the countable
set of rationals is dense in it) and a countable base (the countable class
of all intervals with rational extremities); this permits us to define limits
in terms of sequences only. In general topological spaces, a set $A$ is
dense in $B$ if $\bar{A} \supset B$; in other words, taking for simplicity
$B = \mathcal{X}$, $A$ is dense in $\mathcal{X}$ if no neighborhood is
disjoint from $A$; and $B$ is separable if there exists a countable set $A$
dense in $B$. A countable base at $x$ is a countable class $\{V_x(j)\}$ of
neighborhoods of $x$ such that every neighborhood of $x$ contains a
$V_x(j)$; and the space has a countable base $\{V(j)\}$ if, for every point
$x$, a subclass of $\{V(j)\}$ is a base at $x$.

a. A space has a countable base only if it is separable and has a countable 
base at every point. Then every open covering of the space contains a 
countable covering of the space. 

Note that if a countable set $\{x_j\}$ is dense in a metric space, then at
every $x_j$ there is a countable base of spheres of rational radii, and the
countable union of all these countable bases is a base for the space.

Proof. If the space has a countable base $\{V(j)\}$, then it has a countable
base at every point. Moreover, if $A$ is a set formed by selecting a
point $x_j$ from every $V(j)$, then, since any neighborhood of any point
contains a $V(j)$, it contains the corresponding point $x_j$, so that no
neighborhood is disjoint from $A$.

Finally, given an open covering of the space, every one of its sets
contains a $V(j)$ so that, for every $V(j)$, we can select one set $O_j$ of
the covering containing it. The countable class $\{O_j\}$ is an open covering
of the space, and the proof is terminated.

A basic type of space with a countable base at every point is that of
metric spaces. In fact, topologies in euclidean spaces are determined
by means of distances; this approach characterizes metric spaces. A
metric space is a space $\mathcal{X}$ with a distance (or metric) $d$ on
$\mathcal{X} \times \mathcal{X}$ to $R$ such that, whatever be the points
$x$, $y$, $z$, this function has

the triangle property: $d(x, y) + d(x, z) \geq d(y, z)$,

the identification property: $d(x, y) = 0 \iff x = y$.

Upon replacing $z$ by $x$ and interchanging $x$ and $y$, and then replacing
$z$ by $y$, it follows that

$d(x, y) = d(y, x), \qquad d(x, y) \geq 0.$
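
The two defining properties of a distance, and the two consequences drawn
from them, can be tested mechanically on a finite set of points. The
following is a minimal sketch, assuming integer points on the line; the
function names and the deliberately wrong `signed_difference` are
hypothetical, introduced only for illustration.

```python
from itertools import product


def has_triangle_property(d, points):
    """Check d(x, y) + d(x, z) >= d(y, z) for all triples, in the form used above."""
    return all(d(x, y) + d(x, z) >= d(y, z)
               for x, y, z in product(points, repeat=3))


def has_identification_property(d, points):
    """Check that d(x, y) == 0 exactly when x == y."""
    return all((d(x, y) == 0) == (x == y)
               for x, y in product(points, repeat=2))


def is_symmetric_and_nonnegative(d, points):
    """The two consequences: d(x, y) == d(y, x) and d(x, y) >= 0."""
    return all(d(x, y) == d(y, x) and d(x, y) >= 0
               for x, y in product(points, repeat=2))


def euclid(x, y):
    """Usual distance on the line."""
    return abs(x - y)


def signed_difference(x, y):
    """Not a distance: satisfies identification but not the triangle property."""
    return x - y


POINTS = [-2, 0, 1, 5]

assert has_triangle_property(euclid, POINTS)
assert has_identification_property(euclid, POINTS)
assert is_symmetric_and_nonnegative(euclid, POINTS)
```

The signed difference shows that the identification property alone gives
neither symmetry nor nonnegativity; both come from the particular form of the
triangle property.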

It happens frequently, and we shall encounter repeatedly such cases,
that, for some space, a function $d$ with the two foregoing properties
can be defined, except for the property $d(x, y) = 0 \Rightarrow x = y$. Then
the usual procedure is to identify all points $x$, $y$ such that
$d(x, y) = 0$; the space is replaced by the space of “classes of equivalence”
so obtained, and this new space is metrized by $d$.


The topology of a metric space $(\mathcal{X}, d)$ is defined as follows: Let
the sphere $V_x(r)$ with “center” $x$ and “radius” $r$ ($> 0$) be the set of
all points $y$ such that $d(x, y) < r$. A set $A$ is open if, for every
$x \in A$, there exists a sphere $V_x(r) \subset A$; it follows, by the
triangle property, that every sphere is open. Clearly, the class of open sets
so defined is a topology. Since, by the identification property,
$d(x, y) > 0$ when $x \neq y$ and the spheres $V_x(r)$ and $V_y(s)$ are
disjoint for $0 < r, s \leq \frac{1}{2} d(x, y)$, it follows that with the
metric topology so defined, the space is separated; we observe that
$x_n \to x$ means that $d(x_n, x) \to 0$.

A basic property of the metric topology is that at every point $x$ there
is a countable base, say, the sequence of spheres $V_x(1/n)$,
$n = 1, 2, \cdots$, and it is to be expected that properties of metric spaces
can be characterized in countable terms. To begin with:


1. Sequences can converge to at most one point.

2. A point $x \in \bar{A}$ if, and only if, $A$ contains a sequence
$x_n \to x$, so that a set is closed if, and only if, limits of all convergent
sequences of its points belong to it.

3. Every closed (open) set is a countable intersection (union) of open
(closed) sets.

4. A metric space has a countable base if, and only if, it is separable.

5. If $X$ is a function on a metric domain $(\Omega, \rho)$ to a metric space
$(\mathcal{X}, d)$, then $X(\omega') \to X(\omega)$ as $\omega' \to \omega$
if, and only if, $X(\omega_n) \to X(\omega)$ whatever be the sequence
$\omega_n \to \omega$.
Proof. The first assertion follows from the separation theorem.

The “if” part of the second assertion is immediate, and for the “only
if” part it suffices to take $x_n \in A \cap V_x(1/n)$.

For the third assertion, form the open sets
$O_n = \bigcup_{x \in A} V_x(1/n)$; those sets contain $A$ so that
$A \subset \bigcap O_n$. On the other hand, for every $x \in \bigcap O_n$
there exist points $x_n \in A$ such that $x \in V_{x_n}(1/n)$, and hence
$x_n \to x$; since $A$ is closed, it follows by the second assertion that
$x \in A$, and hence $A \supset \bigcap O_n$. Thus, closed $A = \bigcap O_n$
and the dual assertion for open sets follows by complementations.

The fourth assertion follows from a.

Finally, if $X(\omega') \to X(\omega)$ as $\omega' \to \omega$, then, clearly,
$X(\omega_n) \to X(\omega)$ as $\omega_n \to \omega$. Since
$X(\omega') \not\to X(\omega)$ as $\omega' \to \omega$ implies that there
exist points $\omega_n \in V_\omega(1/n)$ such that
$X(\omega_n) \not\to X(\omega)$, while $\omega_n \to \omega$, the last
assertion follows.


Metric completeness and compactness. The basic criterion for convergence
of numerical sequences is the (Cauchy) mutual convergence criterion: a
sequence $x_n$ is mutually convergent, that is, $d(x_m, x_n) \to 0$ as
$m, n \to \infty$ if, and only if, the sequence $x_n$ converges. In a metric
space, if $x_n \to x$, then, by the triangle inequality,
$d(x_m, x_n) \leq d(x, x_m) + d(x, x_n) \to 0$ as $m, n \to \infty$, but the
converse is not necessarily true (take the space of all rationals with
euclidean distance); if it is true, that is, if $d(x_m, x_n) \to 0$ implies
that $x_n \to$ some $x$, then the mutual convergence criterion is valid, and
we say that the space is complete. Complete metric spaces have many important
properties, which follow.

Call $\Delta(A) = \sup_{x, y \in A} d(x, y)$ the diameter of $A$; $A$ is
bounded if $\Delta(A)$ is finite.


A. Cantor’s theorem. In a complete metric space, every nonincreasing
sequence of closed nonempty sets $A_n$ such that the sequence of their
diameters $\Delta(A_n)$ converges to 0 has a nonempty intersection consisting
of one point only.

Proof. Take $x_n \in A_n$ and $m \geq n$. Since
$d(x_m, x_n) \leq \Delta(A_n) \to 0$, it follows that $x_n \to$ some $x$.
Since $x_m \in A_m \subset A_n$ for all $m \geq n$ and the set $A_n$ is
closed, $x$ belongs to every $A_n$; hence $x \in \bigcap A_n$. If now
$d(x, x') > 0$, then, from some $k$ on, $d(x, x') > \Delta(A_k)$ so that
$x' \notin A_k \supset \bigcap A_n$. The assertion is proved.
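
Cantor’s theorem is the mechanism behind the bisection method on the line.
The following is a minimal sketch, assuming a function `f` with
`f(a) <= 0 <= f(b)` and exact rational endpoints; `nested_intervals` is a
hypothetical helper name, not something defined in the text.

```python
from fractions import Fraction


def nested_intervals(f, a, b, steps):
    """Bisect [a, b], assuming f(a) <= 0 <= f(b); return the nested closed intervals."""
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if f(mid) <= 0:
            a = mid  # keep the right half, where the sign change persists
        else:
            b = mid  # keep the left half
        intervals.append((a, b))
    return intervals


def diameter(interval):
    """Diameter of a closed interval [a, b]."""
    a, b = interval
    return b - a


# Nonincreasing closed intervals with diameters 2**-k around the square root of 2.
INTERVALS = nested_intervals(lambda x: x * x - 2, Fraction(1), Fraction(2), 20)
assert diameter(INTERVALS[-1]) == Fraction(1, 2 ** 20)
```

In the complete space of reals these intervals have exactly one common
point. Restricted to the rationals, which are not complete, the same sets are
still closed, nonempty and nonincreasing with diameters tending to 0, yet
their intersection is empty because the only candidate is irrational.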
A set $A$ is nowhere dense if the complement of $\bar{A}$ is dense in the
space, or, equivalently, if $\bar{A}$ contains no spheres, that is, if the
interior of $\bar{A}$ is empty. A set is of the first category if it is a
countable union of nowhere dense sets, and it is of the second category if it
is not of the first category.

B. Baire’s category theorem. Every complete metric space is of the
second category.

Proof. Let $A = \bigcup A_n$ where the $A_n$ are nowhere dense sets. There
exist a point $x_1$ and a positive $r_1 < 1$ such that the adherence
$\bar{V}_{x_1}(r_1)$ of $V_{x_1}(r_1)$ is disjoint from $A_1$. Proceeding by
recurrence, we form a decreasing sequence of spheres $V_{x_n}(r_n)$ such that
$\bar{V}_{x_n}(r_n)$ is disjoint from $A_n$ and $r_n < 1/n \to 0$. Therefore,
by Cantor’s theorem, there exists a point $x \in \bigcap \bar{V}_{x_n}(r_n)$
and, because of the foregoing disjunction, $x \notin \bigcup A_n$. Thus
$A \neq \mathcal{X}$, and the theorem follows.

We investigate now compact metric spaces and require the two fol¬ 
lowing propositions. 

b. If every mutually convergent sequence contains a convergent subsequence,
then the space is complete.

This follows from the fact that if a sequence $x_n$ is mutually convergent
and contains a convergent subsequence $x_{n'} \to x$, then, by the triangle
inequality, $d(x_n, x) \leq d(x_n, x_{n'}) + d(x_{n'}, x) \to 0$ as
$n, n' \to \infty$, so that $x_n \to x$.

A set is totally bounded if, for every $\epsilon > 0$, it can be covered by a
finite number of spheres of radii $\leq \epsilon$. Clearly, a totally bounded
set is bounded, and a subset of a totally bounded set is totally bounded.

c. A metric space is totally bounded if, and only if, every sequence of
points contains a mutually convergent subsequence. A totally bounded
metric space has a countable base.

Proof. Let the space be not totally bounded; there exists an $\epsilon > 0$
such that the space cannot be covered by finitely many spheres of radii
$\leq \epsilon$. We can select by recurrence a sequence of points $x_n$ whose
mutual distances are $\geq \epsilon$; for, if there is only a finite number
of points $x_1, \cdots, x_m$ with this property, then the spheres of radius
$\epsilon$ centered at these points cover the space. Clearly, this sequence
cannot contain a mutually convergent subsequence.

Conversely, let the space be totally bounded, so that every set is
totally bounded. Then any sequence of points belonging to a set contains a
subsequence contained in a sphere of radius $\leq \epsilon$, member of a
finite covering of the set by spheres of radii $\leq \epsilon$. Thus, given a
sequence $\{x_n\}$, setting $\epsilon = 1, \frac{1}{2}, \cdots, \frac{1}{k},
\cdots$ and proceeding by recurrence, we obtain subsequences such that each is
contained in the preceding one and the $k$th one is formed by points
$x_{1k}, x_{2k}, \cdots$ belonging to a sphere of radius $\leq \frac{1}{k}$.
The “diagonal” subsequence $\{x_{nn}\}$ is such that, from the $k$th term on,
the mutual distances are $\leq \frac{2}{k}$; hence this subsequence is
mutually convergent.

The last assertion follows from the fact that given a totally bounded
space, the class formed by all finite coverings by spheres of radii
$\leq \frac{1}{n}$, $n = 1, 2, \cdots$ is a countable base.
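
The selection “by recurrence” in the first half of this proof is in effect a
greedy algorithm, and on a finite set of points it always terminates with a
finite covering. The following is a minimal sketch, assuming points in the
plane with the euclidean distance; `greedy_epsilon_net` and the sample grid
are hypothetical names and data.

```python
import math


def greedy_epsilon_net(points, eps, dist):
    """Select points one by one so that the selected ones are mutually >= eps apart."""
    centers = []
    for p in points:
        # Keep p only if it is at distance >= eps from every center chosen so far.
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# Hypothetical sample: an 11 x 11 grid in the unit square.
GRID = [(i / 10, j / 10) for i in range(11) for j in range(11)]
EPS = 0.35
CENTERS = greedy_epsilon_net(GRID, EPS, math.dist)

# Every rejected point lies within eps of some center, so the spheres of
# radius eps centered at the selected points cover the whole grid.
assert all(any(math.dist(p, c) < EPS for c in CENTERS) for p in GRID)
```

When the selection can go on forever, the selected points form a sequence
with no mutually convergent subsequence; when it stops, as it must here, the
selected points are the centers of a finite covering.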


C. Metric compactness theorem. The three following properties of
a metric space are equivalent:

$(MC_1)$ every sequence of points contains a convergent subsequence;

$(MC_2)$ every open covering of the space contains a finite covering of the
space (Heine-Borel property);

$(MC_3)$ the space is totally bounded and complete.

Proof. It suffices to show that
$(MC_2) \Rightarrow (MC_1) \Rightarrow (MC_3) \Rightarrow (MC_2)$.

$(MC_2) \Rightarrow (MC_1)$. Apply the compactness theorem.

$(MC_1) \Rightarrow (MC_3)$. Let every sequence of points contain a
convergent (hence mutually convergent) subsequence. Then, by b, the space is
complete and by c, it is also totally bounded.

$(MC_3) \Rightarrow (MC_2)$. According to a, an open covering of a totally
bounded space contains a countable covering $\{O_j\}$ of the space. If no
finite union of the $O_j$ covers the space, then, for every $n$, there exists
a point $x_n \notin \bigcup_{j=1}^{n} O_j$, and, according to c, the sequence
of these points contains a mutually convergent subsequence. Therefore, when
the totally bounded space is also complete, this sequence has a limit point
$x$ which necessarily belongs to some set $O_{j_0}$ of the open countable
covering of the space. Since $x$ is a limit point of the sequence $\{x_n\}$,
there exists some $n > j_0$ such that
$x_n \in O_{j_0} \subset \bigcup_{j=1}^{n} O_j$, and we reach a
contradiction. Thus, there exists a finite subcovering of the space.

Corollary 1. A compact metric space is bounded and separable. 

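Corollary 2. A continuous function $X$ on a compact metric space
$(\Omega, \rho)$ to a metric space $(\mathcal{X}, d)$ is uniformly continuous.

By definition, $X$ is uniformly continuous if for every $\epsilon > 0$ there
exists a $\delta = \delta(\epsilon) > 0$, which depends only upon $\epsilon$,
such that $d(X(\omega), X(\omega')) < \epsilon$ for
$\rho(\omega, \omega') < \delta$.

Proof. Let $\epsilon > 0$. Since $X$ is continuous, for every
$\omega \in \Omega$ there exists a $\delta_\omega$ such that
$d(X(\omega), X(\omega')) < \epsilon/2$ for
$\rho(\omega, \omega') < 2\delta_\omega$. Since the domain is compact, it is
covered by a finite number of spheres $V_{\omega_k}(\delta_{\omega_k})$,
$k = 1, 2, \cdots, n$; let $\delta$ be the smallest of their radii. Any
$\omega$ belongs to one of these spheres, say,
$V_{\omega_k}(\delta_{\omega_k})$, and if $\rho(\omega, \omega') < \delta$,
then $\rho(\omega_k, \omega') < 2\delta_{\omega_k}$. It follows, by the
triangle inequality, that

$d(X(\omega), X(\omega')) \leq d(X(\omega), X(\omega_k)) + d(X(\omega_k), X(\omega')) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$

whenever $\rho(\omega, \omega') < \delta$, and the corollary is proved.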

Let us indicate how a noncomplete metric space $(\mathcal{X}, d)$ can be
completed, that is, can be put in a one-to-one isometric correspondence with
a set in a complete metric space (in fact, with a set dense in the latter
space). The elementary computations will be left to the reader.

Consider all mutually convergent sequences $s = (x_1, x_2, \cdots)$,
$s' = (x'_1, x'_2, \cdots)$, $\cdots$. The function $\rho$ defined by
$\rho(s, s') = \lim d(x_n, x'_n)$ exists and is finite and satisfies the
triangular inequality. Let $s$, $s'$ be equivalent if $\rho(s, s') = 0$; this
notion is symmetric, transitive, and reflexive. It follows that the space
$(S, \rho)$ of all such equivalence classes is a metric space, and it is
easily seen that it is complete. The one-to-one correspondence between
$\mathcal{X}$ and the set $S'$ of classes of equivalence of all “constant
sequences,” defined by $x \leftrightarrow (x, x, \cdots)$, preserves the
distances. Moreover, $S'$ is dense in $S$. Thus $S$ may be considered as
a “minimal completion” of $\mathcal{X}$.

Distance of sets. In what follows the sets under consideration are nonempty
subsets of a metric space $(\mathcal{X}, d)$. The distance of two sets $A$
and $B$ is defined by

$d(A, B) = \inf \{d(x, y) : x \in A, \ y \in B\}$

and

$d(x, B) = d(\{x\}, B) = \inf \{d(x, y) : y \in B\}$

is called the distance of $x$ to $B$. Clearly there are sequences of points
$x_n \in A$ and $y_n \in B$ such that $d(x_n, y_n) \to d(A, B)$ and in
particular $d(x, y_n) \to d(x, B)$.

d. $d(x, A)$ is uniformly continuous in $x$ and, in fact,

$|d(x, A) - d(y, A)| \leq d(x, y).$

For, upon taking infima in $z \in A$ in the triangle inequality
$d(x, z) \leq d(x, y) + d(y, z)$, we obtain
$d(x, A) \leq d(x, y) + d(y, A)$ and interchanging $x$ and $y$
the asserted inequality follows.
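
For a finite set the infimum in the definition of $d(x, A)$ is a minimum, so
the inequality of d can be checked directly. The following is a minimal
sketch, assuming integer points on the line; `dist_to_set`, `line_distance`
and the sample set are hypothetical names and data.

```python
def line_distance(x, y):
    """Usual distance on the line."""
    return abs(x - y)


def dist_to_set(x, a_set, d):
    """d(x, A): smallest distance from x to a point of the finite nonempty set A."""
    return min(d(x, y) for y in a_set)


A_SET = [0, 10]
SAMPLES = range(-5, 16)

# |d(x, A) - d(y, A)| <= d(x, y) for every pair of sample points.
assert all(
    abs(dist_to_set(x, A_SET, line_distance) - dist_to_set(y, A_SET, line_distance))
    <= line_distance(x, y)
    for x in SAMPLES
    for y in SAMPLES
)
```

The bound does not depend on $x$ or on $A$, which is what makes the
continuity uniform.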

D. (i) $\bar{A} = \{x : d(x, A) = 0\}$.

(ii) If disjoint sets $A$ and $B$ are closed then there are disjoint open sets
$U \supset A$ and $V \supset B$ ($\mathcal{X}$ is “normal”) and there is a
continuous function $g$ with $0 \leq g \leq 1$, $g = 0$ on $A$, $g = 1$ on $B$
(“Urysohn lemma”).

(iii) If a compact $A$ and a closed $B$ are disjoint then $d(A, B) > 0$. If
moreover $B$ is also compact then $d(A, B) = d(x, y)$ for some $x \in A$ and
$y \in B$.

Proof. We use continuity in $x$ of $d(x, A)$ without further comment.
The set $A' = \{x : d(x, A) = 0\}$ contains $A$ and is closed as inverse image
of the closed singleton $\{0\}$ under a continuous mapping. Let a sequence
of points $x_n$ of $A$ be such that $d(x, x_n) \to d(x, A)$. Then
$d(x, x_n) \to 0$ for every $x \in A'$ so that $x \in \bar{A}$, hence $A'$ is
contained in $\bar{A}$. Thus (i) is proved.

In (ii), the “normality” assertion follows by (i) and continuity in $x$ of
$d(x, A) - d(x, B)$ upon taking $U = \{x : d(x, A) - d(x, B) < 0\} \supset A$
and $V = \{x : d(x, A) - d(x, B) > 0\} \supset B$. “Urysohn lemma” obtains
with

$g(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}.$

For (iii), let sequences of points $x_n$ of $A$ and $y_n$ of $B$ be such that
$d(x_n, y_n) \to d(A, B)$. Since $A$ is compact the sequence $(x_n)$ contains
a subsequence $x_{n'} \to x \in A$, hence $d(x, y_{n'}) \to d(A, B)$. If
$d(A, B) = 0$ then $y_{n'} \to x$ so that, $B$ being closed, $x \in B$ and
$A$ and $B$ are not disjoint. Since they are disjoint, $d(A, B) > 0$. If,
moreover, also $B$ is compact then the sequence $(y_{n'})$ of points of $B$
contains a subsequence $y_{n''} \to y \in B$, hence $d(x, y) = d(A, B)$. The
proof is terminated.
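
The function $g$ used for the “Urysohn lemma” can be evaluated exactly when
$A$ and $B$ are finite sets of rationals. The following is a minimal sketch
under that assumption; the sets `A_SET` and `B_SET` and the function names
are hypothetical.

```python
from fractions import Fraction

# Hypothetical disjoint closed (finite) sets on the line.
A_SET = [Fraction(0), Fraction(1)]
B_SET = [Fraction(3), Fraction(4)]


def dist_to_set(x, a_set):
    """d(x, A) for a finite nonempty set A on the line."""
    return min(abs(x - y) for y in a_set)


def urysohn(x):
    """g(x) = d(x, A) / (d(x, A) + d(x, B)); the denominator is positive
    because A and B are closed and disjoint."""
    x = Fraction(x)
    d_a = dist_to_set(x, A_SET)
    d_b = dist_to_set(x, B_SET)
    return d_a / (d_a + d_b)


# g vanishes on A, equals 1 on B, and stays between 0 and 1 elsewhere.
assert all(urysohn(x) == 0 for x in A_SET)
assert all(urysohn(x) == 1 for x in B_SET)
assert urysohn(2) == Fraction(1, 2)
```

The open sets $U$ and $V$ of the normality assertion are then
$\{g < \frac{1}{2}\}$ and $\{g > \frac{1}{2}\}$, since
$d(x, A) - d(x, B) < 0$ exactly when $g(x) < \frac{1}{2}$.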

2.4 Linearity and normed spaces. Euclidean spaces are not only
metric and complete but are also normed and linear as defined below.
Unless specified, the “scalars” $a$, $b$, $c$, with or without subscripts, are
either arbitrary real numbers or arbitrary complex numbers, and $x$, $y$,
$z$, with or without subscripts, are arbitrary points in a space
$\mathcal{X}$.

A space $\mathcal{X}$ is linear if a “linear operation” consisting of
operations of “addition” and “multiplication by scalars” is defined on
$\mathcal{X}$ to $\mathcal{X}$ with the properties:

(i) $x + y = y + x$, $\quad x + (y + z) = (x + y) + z$,
$\quad x + z = y + z \Rightarrow x = y$;

(ii) $1 \cdot x = x$, $\quad a(x + y) = ax + ay$, $\quad (a + b)x = ax + bx$,
$\quad a(bx) = (ab)x$.

By setting $-y = -1 \cdot y$, “subtraction” is defined by
$x - y = x + (-y)$.

Elementary computations show that (i) and (ii) imply uniqueness of
the “zero point” or “null point” or “origin” $\theta$, defined by
$\theta = 0 \cdot x$, and with the property $x + \theta = x$. A set in a
linear space generates a linear subspace, the linear closure of the set, by
adding to its points $x, y, \cdots$ all points of the form
$ax + by + \cdots$.

A metric linear space is a linear space with a metric $d$ which is
invariant under translations and makes the linear operations continuous:

(iii) $d(x, y) = d(x - y, \theta)$,
$\quad x_n \to \theta \Rightarrow ax_n \to \theta$,
$\quad a_n \to 0 \Rightarrow a_n x \to \theta$.

If

(iv) $d(x, y) = d(x - y, \theta)$, $\quad d(ax, \theta) = |a| \, d(x, \theta)$,

then (iii) holds, $d(x, \theta)$ is called norm of $x$ and is denoted by
$\|x\|$, and the metric linear space is then a “normed linear space.”

Equivalently, a normed linear space is a linear space on which is defined
a norm with values $\|x\| \geq 0$ such that

(v) $\|x + y\| \leq \|x\| + \|y\|$, $\quad \|x\| = 0 \iff x = \theta$,
$\quad \|ax\| = |a| \cdot \|x\|$,

and the metric $d$ is determined by the norm by setting

$d(x, y) = \|x - y\|.$

A Banach space is a normed linear space complete in the metric determined
by the norm. For example, the space of all bounded continuous functions $f$
on a topological space $\mathcal{X}$ to the euclidean line is a Banach
space with a norm defined by $\|f\| = \sup_x |f(x)|$. Real spaces with
points $x = (x_1, \cdots, x_N)$ and norms
$\|x\| = (|x_1|^r + \cdots + |x_N|^r)^{1/r}$, $r \geq 1$, are Banach spaces,
and we shall encounter similar but more general spaces $L_r$. If $r = 2$,
then these (euclidean) spaces are Hilbert spaces.
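
The norms with exponent $r$ are easy to experiment with numerically. The
following is a minimal sketch, assuming real points stored as tuples and
floating-point arithmetic; `r_norm` and `parallelogram_gap` are hypothetical
helper names.

```python
import math


def r_norm(x, r):
    """(|x_1|**r + ... + |x_N|**r) ** (1/r) for r >= 1."""
    return sum(abs(c) ** r for c in x) ** (1.0 / r)


def add(x, y):
    return tuple(a + b for a, b in zip(x, y))


def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))


def parallelogram_gap(x, y, r):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the norm with exponent r."""
    return (r_norm(add(x, y), r) ** 2 + r_norm(sub(x, y), r) ** 2
            - 2 * r_norm(x, r) ** 2 - 2 * r_norm(y, r) ** 2)


X, Y = (1.0, 0.0), (0.0, 1.0)

# The triangle property of a norm holds for every exponent r >= 1 tried here.
for r in (1, 1.5, 2, 3):
    assert r_norm(add(X, Y), r) <= r_norm(X, r) + r_norm(Y, r) + 1e-12

# The parallelogram property holds for r = 2 but fails for r = 1.
assert math.isclose(parallelogram_gap(X, Y, 2), 0.0, abs_tol=1e-12)
assert math.isclose(parallelogram_gap(X, Y, 1), 4.0)
```

The failure for $r = 1$ illustrates why only the case $r = 2$ is singled out
as a Hilbert space.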

A Hilbert space is a Banach space whose norm has the parallelogram
property: $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$; such a norm
determines a scalar product. It is simpler to determine the Hilbert
norm by means of a scalar product (corresponding to the scalar product
defined by $(x, y) = \sum_{k=1}^{N} x_k y_k$ in a euclidean space $R^N$) as
follows:

A scalar product is a function on the product of a linear space by itself
to its space of scalars, with values $(x, y)$ such that

(vi) $(ax + by, z) = a(x, z) + b(y, z)$, $\quad (x, y) = \overline{(y, x)}$,
$\quad x \neq \theta \Rightarrow (x, x) > 0$.

Clearly $(x, x)$ is real and nonnegative. The function with values
$\|x\| = (x, x)^{1/2} \geq 0$ is the Hilbert norm determined by the scalar
product. For, obviously, it has the two last properties (v) of a norm. And
it also has the first property (v). This follows by using in the expansion
of $(x + y, x + y)$ the Schwarz inequality

$|(x, y)| \leq \|x\| \cdot \|y\|$;

when $(x, y) = 0$ this inequality is trivially true, and when
$(x, y) \neq 0$ it is obtained by expanding $(x - ay, x - ay) \geq 0$ and
setting $a = (x, x)/(y, x)$. Finally, the parallelogram property is
immediate.
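
The argument for the Schwarz inequality can be followed numerically with
complex scalars. The following is a minimal sketch, assuming the scalar
product $(x, y) = \sum_k x_k \overline{y_k}$ on tuples of complex numbers;
the function names and the two sample points are hypothetical.

```python
import math


def scalar_product(x, y):
    """(x, y) = sum of x_k * conj(y_k): linear in x, conjugate-symmetric."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def hilbert_norm(x):
    """||x|| = (x, x) ** (1/2); (x, x) is real and nonnegative."""
    return math.sqrt(scalar_product(x, x).real)


def schwarz_residual(x, y):
    """(x - a*y, x - a*y) with a = (x, x) / (y, x), assuming (x, y) != 0."""
    a = scalar_product(x, x) / scalar_product(y, x)
    z = tuple(xk - a * yk for xk, yk in zip(x, y))
    return scalar_product(z, z).real


X = (1 + 2j, 3 - 1j)
Y = (2 - 1j, 1j)

# Conjugate symmetry and the Schwarz inequality on this pair of points.
assert scalar_product(X, Y) == scalar_product(Y, X).conjugate()
assert abs(scalar_product(X, Y)) <= hilbert_norm(X) * hilbert_norm(Y)

# Expanding the residual gives (x,x)**2 * (y,y) / |(x,y)|**2 - (x,x), which is
# nonnegative exactly when the Schwarz inequality holds.
xx = scalar_product(X, X).real
yy = scalar_product(Y, Y).real
expected = xx ** 2 * yy / abs(scalar_product(X, Y)) ** 2 - xx
assert math.isclose(schwarz_residual(X, Y), expected)
```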

Linear functionals. The basic concept in the investigation of Banach
spaces is the analogue of $f(x) = cx$, the simplest of nontrivial functions
of classical analysis. A functional $f$ on a normed linear space has for
range space the space of the scalars (the scalars and the points below
are arbitrary, unless specified). $f$ is

linear if $f(ax + by) = af(x) + bf(y)$;

continuous if $f(x_n) \to f(x)$ as $x_n \to x$; if this property holds only
for a particular $x$, then $f$ is continuous at this $x$;

normed or bounded if $|f(x)| \le c\|x\|$ where $c < \infty$ is independent of
$x$; the norm of $f$ is then the finite number
$\|f\| = \sup_{x \ne \theta} |f(x)| / \|x\|$.

For example, a scalar product $(x, y)$ is a linear continuous and normed
functional in $x$ for every fixed $y$. Clearly, if $f$ is linear, then
$f(\theta) = 0$, and a linear functional continuous at $\theta$ is continuous.

a. Let $f$ be a linear functional on a normed linear space. If $f$ is normed,
then it is continuous; and conversely.

Proof. If $f$ is normed, then it is continuous, since

$$|f(x_n) - f(x)| = |f(x_n - x)| \le c\|x_n - x\| \to 0 \text{ as } \|x_n - x\| \to 0.$$

If $f$ is not normed, then it is not continuous, since whatever be $n$ there
exists a point $x_n$ such that $|f(x_n)| > n\|x_n\|$, and, setting
$y_n = x_n / (n\|x_n\|)$, we have $|f(y_n)| > 1$ while $\|y_n\| = 1/n \to 0$.

b. The space of all normed linear functionals $f$ on a normed linear space
is a Banach space with norm $\|f\|$.

Proof. Clearly the space is normed and linear and it remains to
prove that it is complete.

Let $\|f_m - f_n\| \to 0$ as $m, n \to \infty$. For every $\epsilon > 0$ there exists
an $n_\epsilon$ such that $\|f_m - f_n\| < \epsilon$ for $m, n \ge n_\epsilon$; hence
$|f_m(x) - f_n(x)| < \epsilon\|x\|$ whatever be $x$. Since the space of scalars is
complete, it follows that there exists a function $f$ of $x$ such that
$f_n(x) \to f(x)$ and, clearly, $f$ is linear and normed. By letting
$m \to \infty$, we have, for $n \ge n_\epsilon$, $|f(x) - f_n(x)| \le \epsilon\|x\|$
whatever be $x$, that is, $\|f_n - f\| \le \epsilon$. Hence $f_n \to f$
and the proposition is proved.

What precedes applies word for word to more general functions (mappings,
transformations) on a normed linear space to a normed linear
space with the same scalars, and the foregoing proposition remains valid,
provided the range space is complete; it suffices to replace every $|f(x)|$
by $\|f(x)\|$.

The Banach space of normed linear functionals on a Banach space is
said to be its adjoint; a Hilbert space is adjoint to itself. However,
a priori, the adjoint space may consist only of the trivial null functional
$f$ with $\|f\| = 0$. That it is not so will follow (see Corollary 1) from
the basic Hahn-Banach

A. Extension theorem. If $f$ is a normed linear functional on a linear
subspace $A$ of a normed linear space, then $f$ can be extended to a normed
linear functional on the whole space without changing its norm.

Proof. 1° We begin by showing that we can extend the domain of
$f$ point by point. Let $x_0 \notin A$ and let $\|f\| = 1$; this does not restrict
the generality. First assume that the scalars, hence $f$, are real.

The linearity condition determines $f(x + ax_0)$, $x \in A$, by setting it
equal to $f(x) + af(x_0)$ so that it suffices to show that there exists a
number $f(x_0)$ such that $|f(x) + af(x_0)| \le \|x + ax_0\|$ for every $x \in A$
and every number $a$. Since $A$ is a linear subspace, we can replace $x$
by $ax$ and, by letting $x$ vary, the condition becomes

$$\sup_x \{-\|x + x_0\| - f(x)\} \le f(x_0) \le \inf_x \{\|x + x_0\| - f(x)\}.$$

Therefore, acceptable values of $f(x_0)$ exist if the above supremum
is no greater than the above infimum, that is, if whatever be $x', x'' \in A$

$$-\|x' + x_0\| - f(x') \le \|x'' + x_0\| - f(x'')$$

or

$$f(x'') - f(x') \le \|x'' + x_0\| + \|x' + x_0\|.$$

Since by linearity of $f$ and the triangle inequality

$$f(x'') - f(x') = f(x'' - x') \le \|x'' - x'\| \le \|x'' + x_0\| + \|x' + x_0\|,$$

acceptable values of $f(x_0)$ exist.
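The one-point extension step can be made concrete in a small case. The sketch below is an illustration under stated assumptions, not part of the proof: the space is the plane with the norm $|x_1| + |x_2|$, $A$ is the first axis, $f((t, 0)) = t$ (so $\|f\| = 1$), and $x_0 = (0, 1)$. The supremum and infimum of the condition are only sampled on a finite grid of values of $t$, and the names `l1_norm`, `f_on_A`, `admissible_interval` are hypothetical.

```python
def l1_norm(v):
    """Norm |v_1| + |v_2| on the plane (the case r = 1)."""
    return abs(v[0]) + abs(v[1])

def f_on_A(t):
    """f on A = {(t, 0)}: f((t, 0)) = t, so |f(x)| <= ||x|| and ||f|| = 1."""
    return t

def admissible_interval(grid):
    """Sampled bounds sup{-||x + x0|| - f(x)} and inf{||x + x0|| - f(x)}, x = (t, 0)."""
    x0 = (0.0, 1.0)
    lower = max(-l1_norm((t + x0[0], x0[1])) - f_on_A(t) for t in grid)
    upper = min(l1_norm((t + x0[0], x0[1])) - f_on_A(t) for t in grid)
    return lower, upper

grid = [k / 10.0 for k in range(-50, 51)]   # sample points t in [-5, 5]
lower, upper = admissible_interval(grid)    # approximately (-1, 1)
```

In this example every value of $f(x_0)$ between about $-1$ and $1$ is acceptable, which also illustrates that the extension need not be unique.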

We can pass from real scalars to complex scalars, as follows: From
$f(ix) = if(x)$ it follows that $f(x) = g(x) - ig(ix)$, $x \in A$, where
$g = \Re f$ is a real-valued linear functional with $\|g\| \le 1$; $g$ extends
first for all points $x + ax_0$ then for all points
$(x + ax_0) + b \cdot ix_0 = x + (a + ib)x_0$, $a$, $b$ real, and $f$ extends by
the foregoing relation. Now observe that $f$ is linear on the so extended
domain and that, for any given point $x$, upon setting $f(x) = re^{i\alpha}$,
$r \ge 0$, $\alpha$ real, we obtain $|f(x)| = g(e^{-i\alpha}x) \le \|x\|$.

2° We can extend the domain of / point by point. The family of 
all possible extensions of / to linear functionals without change of norm 
is partially ordered by inclusion of their domains. Any linearly ordered 
subfamily of extensions has a supremum in the family — the extension 
on the union of the domains. According to a consequence of the axiom 
of choice (Zorn’s theorem), it follows that the whole family has a su¬ 
premum which is a member of the family. It must have for domain 
the whole space, for otherwise, by 1 °, it could be extended further. 
The theorem is proved. 

Corollary 1. Let $x_0$ be a nonzero point of a normed linear space,
and let $A$ be a closed linear subspace. There exist linear functionals $f$,
$f'$ on the space such that

$$\|f\| = 1 \text{ and } f(x_0) = \|x_0\|,$$
$$f' = 0 \text{ on } A \text{ and } f'(x_0) = d(x_0, A) = \inf_{x \in A} d(x_0, x).$$

Set $f(ax_0) = a\|x_0\|$, $f'(ax_0 + x) = a\,d(x_0, A)$, $x \in A$, and extend.

Corollary 2. A functional $f$ on a set $A$ in a normed linear space extends
to a normed linear functional on the whole space with norm bounded
by $c$ $(< \infty)$ if, and only if,

$$\Big|\sum_k a_k f(x_k)\Big| \le c\,\Big\|\sum_k a_k x_k\Big\|$$

whatever be the finite number of arbitrary points $x_k \in A$ and of arbitrary
scalars $a_k$.

Proof. The "only if" assertion is immediate. As for the "if" assertion,
assume that the inequality is true, and observe that the linear
closure of $A$ consists of all points of the form $x = \sum_k a_k x_k$. Linearity
of $f$ on this closure implies that we must set $f(x) = \sum_k a_k f(x_k)$. Then,
on the closure, $|f(x)| \le c\|x\|$, and $f$ is uniquely determined, since,
for $x = \sum_k a_k x_k = \sum_{k'} a'_{k'} x'_{k'}$, we have

$$\Big|\sum_k a_k f(x_k) - \sum_{k'} a'_{k'} f(x'_{k'})\Big| \le c\,\Big\|\sum_k a_k x_k - \sum_{k'} a'_{k'} x'_{k'}\Big\| = 0.$$

The assertion follows by the extension theorem.

This corollary permits us to solve various moment problems as well 
as to find conditions for existence of solutions of systems of linear equa¬ 
tions with an infinity of unknowns. 

§ 3. ADDITIVE SET FUNCTIONS 

3.1 Additivity and continuity. A set function $\varphi$ is defined on a
nonempty class $\mathcal{C}$ of sets in a space $\Omega$ by assigning to every set
$A \in \mathcal{C}$ a single number $\varphi(A)$, finite or infinite, the value of
$\varphi$ at $A$. If all values of $\varphi$ are finite, $\varphi$ is said to be finite,
and we write $|\varphi| < \infty$. If every set in $\mathcal{C}$ is a countable union
of sets in $\mathcal{C}$ at which $\varphi$ is finite, $\varphi$ is said to
be $\sigma$-finite. To avoid trivialities, we assume that every set function
has at least one finite value. Unless otherwise stated, $\varphi$ denotes a set
function and all sets considered are sets of the class on which this function
is defined, so that the properties below are valid as long as $\varphi$ is defined
for the sets which appear there.

$\varphi$ is said to be additive if

$$\varphi\Big(\sum A_j\Big) = \sum \varphi(A_j)$$

either for every countable or only for every finite class of disjoint
sets. In the first case $\varphi$ is said to be countably additive or
$\sigma$-additive, and in the second case $\varphi$ is said to be finitely additive.
In order that sums $\sum \varphi(A_j)$ be always meaningful we have to exclude the
possibility of expressions of the form $+\infty - \infty$. In fact, if the sums
always exist, $\varphi$ is defined on a field, and $\varphi(A) = +\infty$ and
$\varphi(B) = -\infty$, then $\varphi(\Omega) = \varphi(A) + \varphi(A^c) = +\infty$ and
$\varphi(\Omega) = \varphi(B) + \varphi(B^c) = -\infty$, while the function
$\varphi$ is single-valued. Thus, by definition,

an additive set function has the additivity property above, and one of the
values $+\infty$ or $-\infty$ is not allowed.

To fix ideas we assume that the value $-\infty$ is excluded, unless otherwise
stated.

A nonnegative additive set function is called a content or a measure
according as it is finitely additive or $\sigma$-additive. Let $\varphi$ be additive.
If $A \supset B$, then, by additivity,

$$\varphi(A) = \varphi(B) + \varphi(A - B).$$

It follows, upon taking $A = B$ (so that $A - B = \emptyset$) with $\varphi(B)$
finite, that $\varphi(\emptyset) = 0$.

A convergent series of terms, which are not necessarily of constant
sign, may depend upon the order of the terms. This possibility is
excluded in our case by

a. If $\varphi$ is $\sigma$-additive and $|\varphi(\sum A_n)| < \infty$, then the series
$\sum \varphi(A_n)$ is absolutely convergent.

Proof. Set $A_n^+ = A_n$ or $\emptyset$ according as $\varphi(A_n) \ge 0$ or $\varphi(A_n) < 0$,
and set $A_n^- = A_n$ or $\emptyset$ according as $\varphi(A_n) \le 0$ or $\varphi(A_n) > 0$. Then

$$\varphi\Big(\sum A_n^+\Big) = \sum \varphi(A_n^+), \qquad \varphi\Big(\sum A_n^-\Big) = \sum \varphi(A_n^-)$$

and the terms of each series are of constant sign. Since the value $-\infty$
is excluded, the last series converges. Since the sum of both series
converges, so does the first series. The assertion follows.

b. If $\varphi(A)$ is finite and $A \supset B$, then $\varphi(B)$ is finite; in
particular, if $\varphi(\Omega)$ is finite, then $\varphi$ is finite. If $\varphi \ge 0$,
then $\varphi$ is nondecreasing: $\varphi(A) \ge \varphi(B)$ for $A \supset B$, and
subadditive: $\varphi(\bigcup A_j) \le \sum \varphi(A_j)$.

Only the very last assertion needs verification and follows from

$$\varphi\Big(\bigcup A_j\Big) = \varphi(A_1 + A_1^c A_2 + A_1^c A_2^c A_3 + \cdots)$$
$$= \varphi(A_1) + \varphi(A_1^c A_2) + \varphi(A_1^c A_2^c A_3) + \cdots$$
$$\le \varphi(A_1) + \varphi(A_2) + \varphi(A_3) + \cdots.$$

We intend to show that the difference between finite additivity and
$\sigma$-additivity lies in continuity properties. $\varphi$ is said to be continuous
from below or from above according as

$$\varphi(\lim A_n) = \lim \varphi(A_n)$$

for every sequence $A_n \uparrow$, or for every sequence $A_n \downarrow$ such that
$\varphi(A_n)$ is finite for some value $n_0$ of $n$ (hence, by b, for all
$n \ge n_0$). If $\varphi$ is continuous from above and from below, it is said to
be continuous. Continuity might hold at a fixed set $A$ only, that is, for
all monotone sequences which converge to $A$; continuity at $\emptyset$ reduces
to continuity from above at $\emptyset$.

A. Continuity theorem for additive set functions. A $\sigma$-additive
set function is finitely additive and continuous. Conversely, if a set
function is finitely additive and, either continuous from below, or finite and
continuous at $\emptyset$, then the set function is $\sigma$-additive.

Proof. Let $\varphi$ be $\sigma$-additive and, a fortiori, additive. $\varphi$ is continuous
from below, for, if $A_n \uparrow$, then

$$\lim A_n = \bigcup A_n = A_1 + (A_2 - A_1) + (A_3 - A_2) + \cdots$$

so that

$$\varphi(\lim A_n) = \lim \{\varphi(A_1) + \varphi(A_2 - A_1) + \cdots + \varphi(A_n - A_{n-1})\} = \lim \varphi(A_n).$$

$\varphi$ is continuous from above, for, if $A_n \downarrow$ and $\varphi(A_{n_0})$ is finite,
then $A_{n_0} - A_n \uparrow$ for $n \ge n_0$, the foregoing result for continuity
from below applies and, hence,

$$\varphi(A_{n_0}) - \varphi(\lim A_n) = \varphi(\lim (A_{n_0} - A_n)) = \lim \varphi(A_{n_0} - A_n) = \varphi(A_{n_0}) - \lim \varphi(A_n)$$

or

$$\varphi(\lim A_n) = \lim \varphi(A_n).$$

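Conversely, let $\varphi$ be finitely additive. If $\varphi$ is continuous from below,
then

$$\varphi\Big(\sum_{n=1}^{\infty} A_n\Big) = \varphi\Big(\lim \sum_{k=1}^{n} A_k\Big) = \lim \varphi\Big(\sum_{k=1}^{n} A_k\Big) = \lim \sum_{k=1}^{n} \varphi(A_k) = \sum_{n=1}^{\infty} \varphi(A_n)$$

so that $\varphi$ is $\sigma$-additive. If $\varphi$ is finite and continuous at $\emptyset$,
then $\sigma$-additivity follows from

$$\varphi\Big(\sum_{n=1}^{\infty} A_n\Big) = \varphi\Big(\sum_{k=1}^{n} A_k\Big) + \varphi\Big(\sum_{k=n+1}^{\infty} A_k\Big) = \sum_{k=1}^{n} \varphi(A_k) + \varphi\Big(\sum_{k=n+1}^{\infty} A_k\Big)$$

and

$$\varphi\Big(\sum_{k=n+1}^{\infty} A_k\Big) \to \varphi(\emptyset) = 0.$$

The proof is complete.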
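A simple numerical picture of the two continuity properties may help. The sketch below assumes a hypothetical measure on the positive integers with point masses $2^{-k}$, which is not taken from the text; the value of the tail sets is obtained from $\sigma$-additivity (total mass 1), and the names `phi_initial`, `phi_tail` are illustrative only.

```python
from fractions import Fraction

def phi_initial(n):
    """phi({1, ..., n}) for the point masses 2^(-k), summed term by term."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

def phi_tail(n):
    """phi({n+1, n+2, ...}) = phi(whole space) - phi({1, ..., n}), total mass 1 assumed."""
    return Fraction(1) - phi_initial(n)

# Increasing sets {1..n}: the values increase to phi of the whole space, 1.
below = [phi_initial(n) for n in range(1, 6)]
# Decreasing tails {n+1, ...} with empty limit: the values decrease to 0.
tails = [phi_tail(n) for n in range(1, 6)]
```

The first list illustrates continuity from below along $A_n \uparrow$, the second continuity at the empty set along the tails $\sum_{k > n} A_k$ used in the converse part of the proof.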

The continuity properties of a $\sigma$-additive set function $\varphi$ acquire their
full significance when $\varphi$ is defined on a $\sigma$-field. Then, not only is
$\varphi$ defined for all countable sums and monotone limits of sets of the
$\sigma$-field but, moreover, $\varphi$ attains its extrema at some sets of this
$\sigma$-field. More precisely

c. If $\varphi$ on a $\sigma$-field $\mathcal{A}$ is $\sigma$-additive, then there exist sets
$C$ and $D$ of $\mathcal{A}$ such that $\varphi(C) = \sup \varphi$ and $\varphi(D) = \inf \varphi$.

Proof. We prove the existence of $C$; the proof of the existence of $D$
is similar. If $\varphi(A) = +\infty$ for some $A \in \mathcal{A}$, then we can set $C = A$
and the theorem is trivially true. Thus, let $\varphi < \infty$, so that, since the
value $-\infty$ is excluded, $\varphi$ is finite.

There exists a sequence $\{A_n\} \subset \mathcal{A}$ such that $\varphi(A_n) \to \sup \varphi$.
Let $A = \bigcup A_n$ and, for every $n$, consider the partition of $A$ into $2^n$
sets $A_{nm}$ of the form $\bigcap_{k=1}^{n} A'_k$ where $A'_k = A_k$ or $A - A_k$; for
$n < n'$, every $A_{nm}$ is a finite sum of sets $A_{n'm'}$. Let $B_n$ be the sum
of all those $A_{nm}$ for which $\varphi$ is nonnegative; if there are none, set
$B_n = \emptyset$. Since, on the one hand, $A_n$ is the sum of some of the $A_{nm}$
and, on the other hand, for $n' > n$, every $A_{n'm'}$ is either in $B_n$ or
disjoint from $B_n$, we have

$$\varphi(A_n) \le \varphi(B_n) \le \varphi(B_n \cup B_{n+1} \cup \cdots \cup B_{n'}).$$

Letting $n' \to \infty$, it follows, by continuity from below, that

$$\varphi(A_n) \le \varphi(B_n) \le \varphi\Big(\bigcup_{k=n}^{\infty} B_k\Big).$$

Letting now $n \to \infty$ and setting $C = \lim \bigcup_{k=n}^{\infty} B_k$, it follows,
by continuity from above ($\varphi$ is finite), that $\sup \varphi \le \varphi(C)$. But
$\varphi(C) \le \sup \varphi$ and, thus, $\varphi(C) = \sup \varphi$. The proof is complete.

Corollary. If $\varphi$ on a $\sigma$-field $\mathcal{A}$ is $\sigma$-additive (and the value
$-\infty$ is excluded), then $\varphi$ is bounded below.

3.2 Decomposition of additive set functions. We shall find later that
the "natural" domains of $\sigma$-additive set functions are $\sigma$-fields. We
intend to show that on such domains $\sigma$-additive set functions coincide
with signed measures, that is, differences of two measures of which one
at least is finite. Clearly, a signed measure is $\sigma$-additive so that we
need only to prove the converse.

Let $\varphi$ be an additive function on a field $\mathcal{C}$ and define $\varphi^+$ and
$\varphi^-$ on $\mathcal{C}$ by

$$\varphi^+(A) = \sup_{B \subset A} \varphi(B), \qquad \varphi^-(A) = -\inf_{B \subset A} \varphi(B), \qquad A, B \in \mathcal{C}.$$

The set functions $\varphi^+$, $\varphi^-$ and $\bar{\varphi} = \varphi^+ + \varphi^-$ are called the
upper, lower, and total variation of $\varphi$ on $\mathcal{C}$, respectively. Since
$\varphi(\emptyset) = 0$, these variations are nonnegative.

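A. Jordan-Hahn decomposition theorem. If $\varphi$ on a $\sigma$-field $\mathcal{A}$ is
$\sigma$-additive, then there exists a set $D$ such that, for every $A \in \mathcal{A}$,

$$-\varphi^-(A) = \varphi(AD), \qquad \varphi^+(A) = \varphi(AD^c).$$

$\varphi^+$ and $\varphi^-$ are measures and $\varphi = \varphi^+ - \varphi^-$ is a signed measure.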

Proof. According to 3.1c, there exists a set $D \in \mathcal{A}$ such that
$\varphi(D) = \inf \varphi$; since the value $-\infty$ is excluded, we have

$$-\infty < \varphi(D) = \inf \varphi \le 0.$$

For every set $A \in \mathcal{A}$, $\varphi(AD) \le 0$ and $\varphi(AD^c) \ge 0$, since
$\varphi \ge \varphi(D)$ while, if $\varphi(AD) > 0$, then

$$\varphi(D - AD) = \varphi(D) - \varphi(AD) < \varphi(D),$$

and if $\varphi(AD^c) < 0$, then

$$\varphi(D + AD^c) = \varphi(D) + \varphi(AD^c) < \varphi(D).$$

It follows that, for every $B \subset A$ ($A, B \in \mathcal{A}$),

$$\varphi(B) \le \varphi(BD^c) \le \varphi(BD^c) + \varphi((A - B)D^c) = \varphi(AD^c),$$

and, hence, $\varphi^+(A) \le \varphi(AD^c)$. Since $AD^c$ is one of the $B$'s, the reverse
inequality is also true. Therefore, for every $A \in \mathcal{A}$, $\varphi^+(A) = \varphi(AD^c)$
and, similarly, $-\varphi^-(A) = \varphi(AD)$, so that

$$\varphi(A) = \varphi(AD^c) + \varphi(AD) = \varphi^+(A) - \varphi^-(A).$$

Moreover, $\varphi^+$ on $\mathcal{A}$ is a measure since $\varphi^+ \ge 0$ and

$$\varphi^+\Big(\sum A_j\Big) = \varphi\Big(\sum A_j D^c\Big) = \sum \varphi(A_j D^c) = \sum \varphi^+(A_j).$$

Similarly $\varphi^-$ on $\mathcal{A}$ is a measure and, furthermore, it is bounded by
$-\varphi(D)$ which is finite. Thus, $\varphi = \varphi^+ - \varphi^-$ is a signed measure, and
the proof is complete.
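On a finite space the decomposition can be verified by brute force. The following sketch assumes a hypothetical signed measure given by point masses on a four-point space (the masses and the names `phi`, `subsets`, `upper_variation`, `lower_variation` are illustrative and not from the text). The set $D$ is taken as the set of points of negative mass, so that $\varphi(D) = \inf \varphi$, and the variations are computed directly from their definitions as a supremum and an infimum over subsets.

```python
from itertools import combinations

def phi(subset, mass):
    """Signed measure of a finite set: the sum of its point masses."""
    return sum(mass[w] for w in subset)

def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def upper_variation(a, mass):
    """phi^+(A) = sup of phi(B) over B contained in A."""
    return max(phi(b, mass) for b in subsets(a))

def lower_variation(a, mass):
    """phi^-(A) = -inf of phi(B) over B contained in A."""
    return -min(phi(b, mass) for b in subsets(a))

mass = {"a": 2, "b": -3, "c": 1, "d": -1}          # hypothetical point masses
omega = frozenset(mass)
D = frozenset(w for w in omega if mass[w] < 0)     # here phi(D) = inf phi = -4

# Check phi^+(A) = phi(A D^c) and -phi^-(A) = phi(A D) for every A.
decomposition_holds = all(
    upper_variation(a, mass) == phi(a - D, mass)
    and -lower_variation(a, mass) == phi(a & D, mass)
    for a in subsets(omega)
)
```

The exhaustive search over subsets is exponential in the size of the space and is only meant to mirror the definitions, not to be an efficient method.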

Jordan decomposition. If $\mathcal{A}$ is only a field but $\varphi$ is also bounded,
then it is still a signed measure. Prove, proceeding directly from the
definitions, showing first that $\varphi^{\pm}$ are bounded measures.

*§ 4. CONSTRUCTION OF MEASURES ON σ-FIELDS

4.1 Extension of measures. If two set functions $\varphi$ on $\mathcal{C}$ and $\varphi'$ on
$\mathcal{C}'$ take the same values at sets of a common subclass $\mathcal{C}''$, we say that
$\varphi$ and $\varphi'$ agree or coincide on $\mathcal{C}''$. If $\mathcal{C} \subset \mathcal{C}'$ and $\varphi$
and $\varphi'$ agree on $\mathcal{C}$, we say that $\varphi$ is a restriction of $\varphi'$ on
$\mathcal{C}$, and $\varphi'$ is an extension of $\varphi$ on $\mathcal{C}'$. The general extension
problem can be stated as follows: find extensions of $\varphi$ which preserve some
specified properties. If, given $\mathcal{C}' \supset \mathcal{C}$, there is one, and only one,
such extension on $\mathcal{C}'$, we say that this extension is determined.

Here, we are concerned with the extension of measures to measures
and shall denote extensions and restrictions of a measure $\mu$ by the same
letter; as long as their domains are specified, there is no confusion
possible. While any restriction of a measure is determined and is a
measure, an extension of a measure to a measure on a given class may not
exist, and if one exists it may not be unique. Our aim is to produce classes
on which such extensions exist, and cases where they are determined.
The results of the investigation are summarized by the Caratheodory

A. Extension theorem. A measure $\mu$ on a field $\mathcal{C}$ can be extended
to a measure on the minimal $\sigma$-field over $\mathcal{C}$. If, moreover, $\mu$ is
$\sigma$-finite, then the extension is determined and is $\sigma$-finite.

We prove the extension theorem by means of an intermediate weaker 
extension which preserves a part only of the properties characterizing 
a measure. We shall need various notions that we collect here. 

A set function $\mu^0$ on the class $S(\Omega)$ of all sets in the space $\Omega$ is
called an outer measure if it is sub $\sigma$-additive, nondecreasing, and takes
the value 0 at $\emptyset$:

$$\mu^0\Big(\bigcup A_j\Big) \le \sum \mu^0(A_j) \text{ for every countable class } \{A_j\},$$
$$\mu^0(A) \le \mu^0(B) \text{ for } A \subset B, \qquad \mu^0(\emptyset) = 0.$$

A set $A$ is called $\mu^0$-measurable if, for every set $D \subset \Omega$,

$$\mu^0(D) \ge \mu^0(AD) + \mu^0(A^c D).$$

Since the relation is always true when $\mu^0(D) = \infty$, it suffices to consider
sets $D$ with $\mu^0(D) < \infty$. Since $\mu^0$ is sub $\sigma$-additive, the reverse
inequality is always true and, hence, $A$ is $\mu^0$-measurable if, and only if,

$$\mu^0(D) = \mu^0(AD) + \mu^0(A^c D).$$

The class of all $\mu^0$-measurable sets will be denoted by $\mathcal{A}^0$ and, clearly,
contains $\emptyset$ and $\Omega$. The outer extension of a measure $\mu$ given on a field
$\mathcal{C}$ is defined for all sets $A \subset \Omega$ by

$$\mu^0(A) = \inf \sum \mu(A_j)$$

where the infimum is taken over all countable classes $\{A_j\} \subset \mathcal{C}$ such
that $A \subset \bigcup A_j$ (coverings in $\mathcal{C}$ of $A$, for short). Since
$\Omega \in \mathcal{C}$, there is at least one covering (consisting of $\Omega$) in
$\mathcal{C}$ of every $A$ so that the definition of an outer extension is justified.
The use of the same symbol $\mu^0$ both for an outer measure and an outer
extension is due to the property, to be proved first, that the outer extension
of the measure $\mu$ on $\mathcal{C}$ is an extension of $\mu$ to an outer measure. Next
we shall prove that the restriction to $\mathcal{A}^0$ of $\mu^0$ is a measure and
that $\mathcal{A}^0$ is a $\sigma$-field, and the extension theorem will follow.

a. The outer extension $\mu^0$ of a measure $\mu$ on a field $\mathcal{C}$ is an
extension of $\mu$ to an outer measure.

Proof. We prove first that $\mu^0$ is an extension of $\mu$.

If $A \in \mathcal{C}$, then $\mu^0(A) \le \mu(A)$. On the other hand, since $\mu$ is a
measure, $\mu(A) \le \sum \mu(A_j)$ for every covering $\{A_j\}$ in $\mathcal{C}$ of $A$,
so that $\mu(A) \le \mu^0(A)$ and, hence, $\mu^0(A) = \mu(A)$ for $A \in \mathcal{C}$. It
remains to prove that $\mu^0$ is an outer measure.

To begin with, $\mu^0(\emptyset) = 0$ since $\emptyset \in \mathcal{C}$. Furthermore,
$\mu^0(A) \le \mu^0(B)$ for $A \subset B$, since every covering in $\mathcal{C}$ of $B$ is
also a covering of $A$. Finally, we prove that $\mu^0$ is sub $\sigma$-additive.

Let $\epsilon > 0$ and let $\{A_j\}$ be an arbitrary countable class. For every
$A_j$ there is a covering $\{A_{jk}\}$ in $\mathcal{C}$ such that

$$\sum_k \mu(A_{jk}) \le \mu^0(A_j) + \frac{\epsilon}{2^j}.$$

Since $\bigcup_j A_j \subset \bigcup_{j,k} A_{jk}$, it follows that

$$\mu^0\Big(\bigcup_j A_j\Big) \le \sum_{j,k} \mu(A_{jk}) \le \sum_j \mu^0(A_j) + \epsilon$$

and, $\epsilon > 0$ being arbitrarily close to zero, sub $\sigma$-additivity is proved.

b. If $\mu^0$ is an outer measure, then the class $\mathcal{A}^0$ of $\mu^0$-measurable
sets is a $\sigma$-field and $\mu^0$ on $\mathcal{A}^0$ is a measure.

Proof. We prove first that $\mathcal{A}^0$ is a field and $\mu^0$ on $\mathcal{A}^0$ is a content.

If $A \in \mathcal{A}^0$, then $A^c \in \mathcal{A}^0$ since the definition of
$\mu^0$-measurability is symmetric in $A$ and $A^c$. If $A, B \in \mathcal{A}^0$, then
$AB \in \mathcal{A}^0$ since

$$\mu^0(D) = \mu^0(AD) + \mu^0(A^c D)$$
$$= \mu^0(ABD) + \mu^0(AB^c D) + \mu^0(A^c BD) + \mu^0(A^c B^c D)$$
$$\ge \mu^0(ABD) + \mu^0(AB^c D \cup A^c BD \cup A^c B^c D)$$
$$= \mu^0(ABD) + \mu^0((AB)^c D).$$

Thus $\mathcal{A}^0$ is closed under complementations and finite intersections and,
hence, under finite unions, so that $\mathcal{A}^0$ is a field.

$\mu^0$ is finitely additive on $\mathcal{A}^0$ since, if $A, B \in \mathcal{A}^0$ and are disjoint,

$$\mu^0(A + B) = \mu^0((A + B)A) + \mu^0((A + B)A^c) = \mu^0(A) + \mu^0(B).$$

Since $\mu^0(A) \ge \mu^0(\emptyset) = 0$, $\mu^0$ on $\mathcal{A}^0$ is a content.

To complete the proof, it suffices to show that, if the $A_n \in \mathcal{A}^0$ are
disjoint, then $A = \sum A_n \in \mathcal{A}^0$ and $\mu^0(A) = \sum \mu^0(A_n)$.

Since $B_n = \sum_{k=1}^{n} A_k \in \mathcal{A}^0$, we have

$$\mu^0(D) = \mu^0(B_n D) + \mu^0(B_n^c D) \ge \sum_{k=1}^{n} \mu^0(A_k D) + \mu^0(A^c D)$$

and, letting $n \to \infty$,

$$\mu^0(D) \ge \sum_{n=1}^{\infty} \mu^0(A_n D) + \mu^0(A^c D) \ge \mu^0(AD) + \mu^0(A^c D).$$

The inequality between the extreme sides shows that $A \in \mathcal{A}^0$. The
first inequality with $D$ replaced by $A$ becomes

$$\mu^0(A) \ge \sum \mu^0(A_n)$$

while the reverse inequality is always true.

Thus

$$\mu^0(A) = \sum \mu^0(A_n),$$

and the proof is complete.
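The outer extension and the class of measurable sets can be computed explicitly on a small space. The sketch below is an illustration under assumptions not in the text: the space has four points, the field is generated by a hypothetical partition into three atoms with given masses, and the infimum over coverings is evaluated as the total mass of the atoms meeting the set (every covering by sets of this field must contain those atoms). The names `outer`, `is_measurable`, `all_subsets` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({0, 1, 2, 3})
# Hypothetical measure on the field generated by the atoms {0,1}, {2}, {3}.
ATOMS = {
    frozenset({0, 1}): Fraction(1, 2),
    frozenset({2}): Fraction(1, 4),
    frozenset({3}): Fraction(1, 4),
}

def all_subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def outer(a):
    """Outer extension: total mass of the atoms that meet the set a."""
    return sum((m for atom, m in ATOMS.items() if atom & a), Fraction(0))

def is_measurable(a):
    """Caratheodory condition: outer(D) = outer(A D) + outer(A^c D) for every D."""
    return all(outer(d) == outer(a & d) + outer(d - a)
               for d in all_subsets(OMEGA))

measurable = [a for a in all_subsets(OMEGA) if is_measurable(a)]
```

In this example the measurable sets turn out to be exactly the eight unions of atoms: a set such as `{0}`, which splits the atom `{0, 1}`, has outer measure 1/2 but fails the condition with `D = {0, 1}`.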

Remark. Most frequently, a measure $\mu$ is given on a class $\mathcal{D}$ whose
closure under finite summations or under countable summations is a
field $\mathcal{C}$. Then the requirement of $\sigma$-additivity determines the unique
extension of $\mu$ on $\mathcal{C}$.

We are now in a position to prove the extension theorem.

1° For every $A \in \mathcal{C}$ and every $D$ there is, for every $\epsilon > 0$, a covering
$\{A_j\}$ in $\mathcal{C}$ of $D$ such that

$$\mu^0(D) + \epsilon \ge \sum \mu(A_j) = \sum \mu(A A_j) + \sum \mu(A^c A_j) \ge \mu^0(AD) + \mu^0(A^c D).$$

Thus, $A \in \mathcal{A}^0$ and, hence, since the field $\mathcal{C}$ is contained in the
$\sigma$-field $\mathcal{A}^0$, the minimal $\sigma$-field $\mathcal{A}$ over $\mathcal{C}$ is contained in
$\mathcal{A}^0$. It follows, according to a and b, that the contraction on $\mathcal{A}$ of
the measure $\mu^0$ on $\mathcal{A}^0$ is an extension of $\mu$ to a measure on $\mathcal{A}$.
This proves the first part of the theorem.

2° Let $\mu$ on $\mathcal{C}$ be finite, let $\mu_1$ and $\mu_2$ be two extensions of $\mu$ to
measures on $\mathcal{A}$, and let $\mathcal{M} \subset \mathcal{A}$ be the class on which $\mu_1$ and
$\mu_2$ agree. Since $\Omega$ belongs to $\mathcal{C}$, $\mu_1(\Omega) = \mu_2(\Omega) = \mu(\Omega) < \infty$;
hence $\mu_1$ and $\mu_2$ are finite. Since $\mathcal{M}$ contains $\mathcal{C}$ and, for every
monotone sequence $A_n \in \mathcal{M}$,

$$\mu_1(\lim A_n) = \lim \mu_1(A_n) = \lim \mu_2(A_n) = \mu_2(\lim A_n),$$

$\mathcal{M}$ is a monotone class. It follows, by 1.6A, that $\mathcal{M}$ contains the
minimal $\sigma$-field $\mathcal{A}$ over the field $\mathcal{C}$ and, therefore, $\mu_1$ and $\mu_2$
agree on $\mathcal{A}$.

Let now $\mu$ on $\mathcal{C}$ be $\sigma$-finite so that there is a countable class
$\{A_j\} \subset \mathcal{C}$ with $\mu A_j$ finite which covers $\Omega$. Thus, the foregoing
result applies to every subspace $A_j$ and the second part of the theorem follows.

Generalization. The extension theorem is valid for $\sigma$-finite signed
measures $\varphi = \mu' - \mu''$. Extend $\mu'$ and $\mu''$ and observe that 2° applies
with $\varphi$ instead of $\mu$.

Completion. Given a measure $\mu$ on a $\sigma$-field $\mathcal{A}$, it is always possible
to extend $\mu$ to a larger $\sigma$-field obtained as follows: For every
$A \in \mathcal{A}$ and an arbitrary subset $N$ of a null set of $\mathcal{A}$, that is, a set
of measure zero, set $\mu(A \cup N) = \mu(A)$. Clearly, the class $\mathcal{A}_\mu$ of all
sets $A \cup N$ is a $\sigma$-field $\supset \mathcal{A}$ and $\mu$ on $\mathcal{A}_\mu$ is an extension of
$\mu$ to a measure on $\mathcal{A}_\mu$. $\mathcal{A}_\mu$ is called the completion of $\mathcal{A}$ for
$\mu$ and $\mu$ on $\mathcal{A}_\mu$ is called a complete measure. It is easily seen that
$\mathcal{A}_\mu \subset \mathcal{A}^0$, so that the extension theorem provides us automatically
with extensions to complete measures.

4.2 Product probabilities. A measure on a class containing the space
is called a normed measure or a probability when its value for the whole
space is one; we reserve the symbol $P$, with or without affixes, for such
measures.

Let $(\Omega_t, \mathcal{A}_t, P_t)$, $t \in T$, be probability spaces, that is, triplets
consisting of a space $\Omega_t$ of points $\omega_t$, a $\sigma$-field $\mathcal{A}_t$ of measurable
sets $A_t$ (with or without superscripts) in $\Omega_t$, and a probability $P_t$ on
$\mathcal{A}_t$. Let $\mathcal{C}_T$ be the class of all measurable cylinders of the form
$\prod_{t \in T_N} A_t \times \prod_{t \in T - T_N} \Omega_t$ in the product measurable
space $(\prod \Omega_t, \prod \mathcal{A}_t)$. The class $\mathcal{B}_T$ of all finite
sums of these cylinders is a field, and the minimal $\sigma$-field $\mathcal{A}_T$ over
$\mathcal{B}_T$ is, by definition, the product $\sigma$-field $\prod \mathcal{A}_t$. The product
probability $P_T = \prod P_t$ on the class $\mathcal{C}_T$ is defined by assigning to
every interval cylinder the product of the probabilities of its sides: in symbols,

$$P_T\Big(\prod_{t \in T_N} A_t \times \prod_{t \in T - T_N} \Omega_t\Big) = \prod_{t \in T_N} P_t A_t.$$

Clearly, $P_T \Omega_T = 1$ and $P_T$ on $\mathcal{C}_T$ is finitely additive and determines
its extension to a finitely additive set function $P_T$ on $\mathcal{B}_T$. The defining
term "product-probability" is justified by the following theorem
(Andersen and Jessen).


A. Product probability theorem. The product probability $P_T$ on
$\mathcal{B}_T$ is $\sigma$-additive and determines its extension to a probability $P_T$
on the product $\sigma$-field $\mathcal{A}_T$.

Thus, the triplet $(\Omega_T, \mathcal{A}_T, P_T)$ is a probability space, to be called the
product probability space.

Proof. 1° On account of the extension theorem, it suffices to prove
that $P_T$ on $\mathcal{B}_T$ is $\sigma$-additive. Since it is obviously finitely additive on
$\mathcal{B}_T$, on account of the continuity theorem for additive set functions it
suffices to prove that $P_T$ on $\mathcal{B}_T$ is continuous at $\emptyset$. Ab contrario, given
$\epsilon > 0$ arbitrarily close to 0, it suffices to prove that, for every
nonincreasing sequence of measurable cylinders $A_n \downarrow A$ with $P_T A_n > \epsilon$
for every $n$, the limit set $A$ is not empty. Since every cylinder $A_n$ depends
only upon a finite subset of indices, the set of all indices involved in
defining the sequence $A_n$ is countable. By interchanging, if necessary,
the indices, we can restrict ourselves to the product space $\Omega_T = \prod \Omega_n$
and sets $A_n = D_n \times \Omega'_n$ with $D_n \subset \Omega_1 \times \cdots \times \Omega_n$,
$\Omega'_n = \Omega_{n+1} \times \Omega_{n+2} \times \cdots$.

If the set of all indices is finite, then there is an integer N such that, 
for every n y all the factors which follow the iVth one reduce to and 
the argument below applies with corresponding modifications. 

2° Let $P'_1, P'_2, \dots$ be the set functions defined on the fields $\mathcal{B}'_1$,
$\mathcal{B}'_2, \dots$ of all measurable cylinders in $\Omega'_1, \Omega'_2, \dots$, as $P_T$
is defined on $\mathcal{B}_T$. Let $A_n(\omega_1)$, $A_n(\omega_1, \omega_2)$, $\dots$ be the sections
of $A_n$ at $\omega_1 \in \Omega_1$, $(\omega_1, \omega_2) \in \Omega_1 \times \Omega_2$, etc. Clearly,
$A_n(\omega_1) \in \mathcal{B}'_1$. It is easily seen that, if $B_{1n}$ is the set of all
$\omega_1$ such that

$$P'_1 A_n(\omega_1) > \frac{\epsilon}{2},$$

then

$$P_1 B_{1n} + \frac{\epsilon}{2}(1 - P_1 B_{1n}) \ge P_T A_n > \epsilon$$

and, hence,

$$P_1 B_{1n} > \frac{\epsilon}{2}.$$

Since $A_n \downarrow$ implies that $B_{1n} \downarrow$, it follows that, for
$B_1 = \lim B_{1n}$, $P_1 B_1 \ge \epsilon/2$. Thus, $B_1$ is not empty; hence, there is
a point $\bar{\omega}_1 \in \Omega_1$ common to all $B_{1n}$ and, for every $n$,
$P'_1(A_n(\bar{\omega}_1)) > \epsilon/2$. The same argument applied to
$A_n(\bar{\omega}_1) \downarrow$ yields a point $\bar{\omega}_2 \in \Omega_2$ such that
$P'_2(A_n(\bar{\omega}_1, \bar{\omega}_2)) > \epsilon/2^2$, and so on. Therefore, the point
$\bar{\omega} = (\bar{\omega}_1, \bar{\omega}_2, \dots)$ is common to all
$A_n$, so that the limit set $A$ is not empty, and the proof is complete.
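For finitely many finite factor spaces the product probability of a cylinder can be computed directly and compared with the sum of the probabilities of its points. The sketch below is only a finite illustration and does not touch the $\sigma$-additivity argument of the proof; the three factor distributions and the names `FACTORS`, `cylinder_probability`, `sum_over_points` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Three hypothetical factor probability spaces, each on the points "h" and "t".
FACTORS = [
    {"h": Fraction(1, 2), "t": Fraction(1, 2)},
    {"h": Fraction(1, 3), "t": Fraction(2, 3)},
    {"h": Fraction(1, 4), "t": Fraction(3, 4)},
]

def side_probability(side, pmf):
    """Probability of one side A_t of a cylinder under the factor P_t."""
    return sum((pmf[w] for w in side), Fraction(0))

def cylinder_probability(sides):
    """Product of the probabilities of the sides; None stands for a whole factor space."""
    p = Fraction(1)
    for side, pmf in zip(sides, FACTORS):
        if side is not None:
            p *= side_probability(side, pmf)
    return p

def point_probability(point):
    """Probability of a single point of the product space."""
    p = Fraction(1)
    for w, pmf in zip(point, FACTORS):
        p *= pmf[w]
    return p

def sum_over_points(sides):
    """Same cylinder, evaluated by adding the probabilities of all its points."""
    full = [set(pmf) if side is None else set(side)
            for side, pmf in zip(sides, FACTORS)]
    return sum((point_probability(pt) for pt in product(*full)), Fraction(0))

cyl = [{"h"}, {"t"}, None]            # A_1 x A_2 x Omega_3
p_cyl = cylinder_probability(cyl)     # (1/2) * (2/3) = 1/3
```

Both evaluations agree for every cylinder, which is the finite additivity used at the start of the proof; the whole space, with all sides `None`, has probability one.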

We pass now to Borel spaces. 

4.3 Consistent probabilities on Borel fields. We introduce the following
terminology. The set $R = (-\infty, +\infty)$ of all finite numbers $x$
is a real line, the minimal $\sigma$-field over the class of all intervals is the
Borel field $\mathcal{B}$ in $R$, the elements of $\mathcal{B}$ are Borel sets in $R$, and the
measurable space $(R, \mathcal{B})$ is a Borel line. Similarly, the product space
$R_T = \prod R_t$, where every $R_t$ is a real line with points $x_t$, is a real
space with points $x_T = (x_t)$, the product $\sigma$-field $\mathcal{B}_T = \prod \mathcal{B}_t$
where every $\mathcal{B}_t$ is the Borel field in $R_t$, is the Borel field in $R_T$
whose elements are Borel sets in $R_T$, and the measurable space
$(R_T, \mathcal{B}_T)$ is a Borel space. If $T$ is a finite set, we say that $R_T$ is a
finite product space. Cylinders with Borel bases are Borel cylinders and,
clearly, the Borel field $\mathcal{B}_T$ is the minimal $\sigma$-field over the class of
all Borel cylinders or, equivalently, over the class of all cylinders whose
bases are product Borel sets.

Given a finite measure on $\mathcal{B}_T$ we can assume, by dividing it by its
value for $R_T$, that it is a probability $P_T$. Let $T_N = \{t_1, \dots, t_N\}$ be a
finite subset of indices and let $(R_{T_N}, \mathcal{B}_{T_N})$ be the corresponding Borel
space. We define on $\mathcal{B}_{T_N}$ the marginal probability $P_{T_N}$, or projection
of $P_T$ on $R_{T_N}$, by assigning to every Borel set $B_{T_N}$ in $R_{T_N}$ the measure
of the cylinder with basis $B_{T_N}$; in symbols

$$P_{T_N}(B_{T_N}) = P_T(B_{T_N} \times R'_{T_N}), \qquad R'_{T_N} = \prod_{t \in T - T_N} R_t.$$

Marginal probabilities are consistent in the following sense. If $R'$ and
$R''$ are two finite product subspaces of $R_T$ with marginal measures $P'$
and $P''$, respectively, then the projections of $P'$ and $P''$ on their common
subspace, if any, coincide (with the projection of $P_T$ on this subspace).
We want to prove that the converse is true (Daniell, Kolmogorov).
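The consistency of marginal probabilities is easy to observe in a discrete setting. The sketch below replaces the real lines by the two-point set {0, 1} and uses a hypothetical joint probability on three coordinates (the weights and the name `marginal` are assumptions for illustration). It projects onto the coordinate pairs (1, 2) and (2, 3) and checks that both projections give the same marginal on the common coordinate 2.

```python
from fractions import Fraction
from itertools import product

def marginal(joint, keep):
    """Project a joint pmf (dict: tuple -> probability) onto the positions in `keep`."""
    out = {}
    for point, p in joint.items():
        key = tuple(point[i] for i in keep)
        out[key] = out.get(key, Fraction(0)) + p
    return out

# Hypothetical joint probability on {0,1}^3; the weights sum to 36.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {pt: Fraction(w, 36)
         for pt, w in zip(product((0, 1), repeat=3), weights)}

p12 = marginal(joint, (0, 1))        # projection on coordinates 1 and 2
p23 = marginal(joint, (1, 2))        # projection on coordinates 2 and 3
p2_from_12 = marginal(p12, (1,))     # coordinate 2 is position 1 inside p12
p2_from_23 = marginal(p23, (0,))     # coordinate 2 is position 0 inside p23
p2_direct = marginal(joint, (1,))    # projection of the joint probability itself
```

The three ways of obtaining the marginal on coordinate 2 coincide, as the definition of consistency requires.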

A. Consistency theorem. Consistent probabilities $P_{T_N}$ on Borel
fields of all finite product subspaces $R_{T_N}$ of $R_T$ determine a probability
$P_T$ on the Borel field in $R_T$ such that every $P_{T_N}$ is the projection of
$P_T$ on $R_{T_N}$.

Proof. To every Borel cylinder with Borel base $B_{T_N}$ in $R_{T_N}$ we
assign the probability value

$$P_T(B_{T_N} \times R'_{T_N}) = P_{T_N}(B_{T_N}).$$

It is easily seen that $P_T$ on the class $\mathcal{C}_T$ of all Borel cylinders is finitely
additive, and the theorem will follow from the extension theorem if we
prove that $P_T$ on $\mathcal{C}_T$ is continuous at $\emptyset$.

As in the proof of the product probability theorem, it suffices to
prove that, given $\epsilon > 0$ arbitrarily close to zero, if a sequence
$A_n \downarrow A$ of Borel cylinders with bases $B_n$ formed by finite sums of
intervals in $R_1 \times \cdots \times R_n$ is such that, for every $n$,

$$P_T(A_n) = P_{12 \cdots n}(B_n) > \epsilon,$$

then $A$ is not empty. To simplify the writing, set $P = P_T$ and
$P_n = P_{12 \cdots n}$. Since $P_n$ is bounded and continuous from below, in every
interval in $R_1 \times \cdots \times R_n$ we can find a bounded closed interval whose
$P_n$-measure is as close as we wish to that of the original interval.
Therefore, in every $B_n$, we can find a bounded closed Borel set $B'_n$ (formed
by a finite sum of bounded closed intervals) such that
$P_n(B_n - B'_n) < \epsilon/2^{n+1}$ and, hence, if $A'_n$ is the Borel cylinder with
basis $B'_n$, then

$$P(A_n - A'_n) = P_n(B_n - B'_n) < \frac{\epsilon}{2^{n+1}}.$$

It follows, setting $C_n = A'_1 \cap \cdots \cap A'_n$, that $P(A_n - C_n) < \epsilon/2$
or, since $C_n \subset A_n$,

$$P(C_n) > P(A_n) - \frac{\epsilon}{2} > \frac{\epsilon}{2}.$$

Thus every $C_n$ is nonempty and we can select in it a point
$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, \dots)$. It follows from
$C_1 \supset C_2 \supset \cdots$ that for every $p = 0, 1, \dots$,
$x^{(n+p)} \in C_n \subset A'_n$ and hence $(x_1^{(n+p)}, \dots, x_n^{(n+p)}) \in B'_n$.
Since every $B'_n$ is bounded, we can select a subsequence $n_{1k}$ of integers
such that $x_1^{(n_{1k})} \to x_1$ as $k \to \infty$, then within it a subsequence
$n_{2k}$ such that $x_2^{(n_{2k})} \to x_2$, and so on. The diagonal subsequence of
points $x^{(n_{kk})} = (x_1^{(n_{kk})}, x_2^{(n_{kk})}, \dots)$ converges to the point
$x = (x_1, x_2, \dots)$ and
$(x_1^{(n_{kk})}, \dots, x_m^{(n_{kk})}) \to (x_1, \dots, x_m) \in B'_m$ for every $m$.
Therefore, $x \in A'_m \subset A_m$ whatever be $m$ so that
$x \in \bigcap_{m=1}^{\infty} A_m$. Thus this intersection is not empty, and the
assertion is proved.

Extensions. The foregoing theorem can be extended, as follows:
Let $\mathcal{A}_n$ be the $\sigma$-field of Borel cylinders with bases in
$R_1 \times \cdots \times R_n$, and let $\mathcal{B}_n$ be the Borel field in $R_n$.

1° If uniformly bounded measures $\mu_n$ on $\mathcal{A}_n$ form a nondecreasing
sequence, in the sense that $\mu_n A_n \le \mu_{n+1} A_n \le \cdots$ and hence
$\mu_p A_n \uparrow \mu A_n$ as $p \to \infty$ whatever be $n$ and $A_n \in \mathcal{A}_n$, then
$\mu$ extends to a bounded measure on the minimal $\sigma$-field over
$\bigcup \mathcal{A}_n$.

The proof reduces to the previous one as follows. The set function $\mu$
so defined on the field $\bigcup \mathcal{A}_n$ of all Borel cylinders in $\prod R_n$ is,
clearly, finitely additive and bounded. Therefore, it suffices to prove that
on this field $\mu$ is continuous at $\emptyset$. Given $\epsilon > 0$ and
$A_n \in \mathcal{A}_n$, we can find $p$ sufficiently large so that
$\mu_p A_n + \epsilon/2^{n+2} > \mu A_n$. Then we can select a Borel cylinder
$A'_n \subset A_n$ whose basis is a closed and bounded Borel set in
$R_1 \times \cdots \times R_n$ such that $\mu_p(A_n - A'_n) < \epsilon/2^{n+2}$. It follows that

$$\mu A'_n \ge \mu_p A'_n > \mu_p A_n - \frac{\epsilon}{2^{n+2}} > \mu A_n - \frac{\epsilon}{2^{n+1}}$$

so that $\mu(A_n - A'_n) < \epsilon/2^{n+1}$. From here on, the end of the preceding
proof applies word for word.

If $\varphi_n$ on $\mathcal{A}_n$, $n = 1, 2, \dots$, are such that
$\varphi_n(A_n) = \varphi_{n+1}(A_n) = \cdots$, $A_n \in \mathcal{A}_n$, we say that the
$\varphi_n$ are consistent.

2° If the uniformly bounded $\sigma$-additive set functions $\varphi_n$ on $\mathcal{A}_n$ are
consistent, hence $\varphi_p(A_n) \to \varphi(A_n)$ as $p \to \infty$ whatever be $n$ and
$A_n \in \mathcal{A}_n$, then $\varphi$ extends to a $\sigma$-additive bounded set function on
the minimal $\sigma$-field over $\bigcup \mathcal{A}_n$.

The assertion follows from what precedes. For, clearly, the total
variations $\bar{\varphi}_n$ on $\mathcal{A}_n$ form a nondecreasing bounded sequence on
$\bigcup \mathcal{A}_n$, in the sense of 1°. Hence $\lim \bar{\varphi}_n$ is continuous at
$\emptyset$ on $\bigcup \mathcal{A}_n$ and, a fortiori, so is $\varphi$. Now use Jordan
decomposition and generalization in 4.1.

4.4 Lebesgue-Stieltjes measures and distribution functions. Complete
measures on the Borel field in a real line $R = (-\infty, +\infty)$ did, and
still do, play a prominent role. However, being set functions, they are
not easy to handle with the tools of classical analysis, for methods of
analysis were developed to deal primarily with finite point functions on
$R$. It is, therefore, of the greatest methodological importance to
establish a link between the modern notion of measure and the classical
notions. This will be done by showing that there is a class of point
functions on $R$ which can be placed in a one-to-one correspondence with
a very wide class of measures. In this manner, investigations of measures
(and, thereafter, of integrals) will be reduced to investigations of
the corresponding point functions and, thus, the familiar methods of
analysis will apply. Whatever be these point functions they will be
said to represent the corresponding measure.

Among possible representations of measures there are two which are
fundamental: "distribution functions," which represent measures
assigning finite values to finite intervals, to be called Lebesgue-Stieltjes
(L.S.) measures, that we shall introduce now, and "characteristic
functions," which represent the subclass of finite Lebesgue-Stieltjes measures
required in connection with probability problems, that we shall
introduce in Part II. Let $\mathcal{B}$ be the Borel field in $R$ and let $\mu$ be a
Lebesgue-Stieltjes measure. The completion of $\mathcal{B}$ for $\mu$ will be denoted by $\mathcal{B}_\mu$,
and called a Lebesgue-Stieltjes field in $R$, and its elements will be called
Lebesgue-Stieltjes sets in $R$.

A function on $R$ which is finite, nondecreasing, and continuous from
the left is called a distribution function (d.f.). Two d.f.'s will be said
to be equivalent if they differ by some fixed but arbitrary constant.
This notion of equivalence has the usual properties of equivalence: it
is reflexive, transitive, and symmetric. Thus, the class of all d.f.'s
splits into equivalence classes. As the correspondence theorem below
(Lebesgue, Radon) shows, the one-to-one correspondence between
L.S.-measures and d.f.'s is not a correspondence between L.S.-measures and
individual d.f.'s but a correspondence between L.S.-measures and classes
of equivalent d.f.'s, each class to be represented by one of its elements,
arbitrarily chosen.

Let $F$, with or without affixes, denote a d.f. and define its increment
function by

$$F[a, b) = F(b) - F(a), \qquad -\infty < a \le b < +\infty.$$

Since two equivalent d.f.'s have the same increment function and
conversely, it follows that every class of equivalent d.f.'s is characterized
by its increment function. Moreover, the defining properties of d.f.'s
are equivalent to the following:

(i) $0 \le F[a, b) < \infty$, (ii) $F[a, b) \to 0$ as $a \uparrow b$,

and

(iii) $\sum_{k=1}^{n} F[a_k, b_k) + \sum_{k=1}^{n-1} F[b_k, a_{k+1}) = F[a_1, b_n)$

where $a_1 \le b_1 \le a_2 \le \cdots \le a_n \le b_n$ are arbitrary.

A. Correspondence theorem. The relation

$$\mu[a, b) = F[a, b), \qquad -\infty < a \le b < +\infty$$

establishes a one-to-one correspondence between L.S.-measures $\mu$ and d.f.'s $F$
defined up to an equivalence.

Proof. Let $\mathcal{B}_I$ be the class of all intervals $[a, b)$, $-\infty < a \le b < +\infty$.
$\mathcal{B}_I$ is closed under formation of finite intersections. The minimal field
$\mathcal{B}_0$ over $\mathcal{B}_I$ is the class of all finite sums of elements of $\mathcal{B}_I$ and of intervals
of the form $(-\infty, a)$, $[b, +\infty)$, and the minimal $\sigma$-field over $\mathcal{B}_0$ is the
Borel field $\mathcal{B}$.

The proof of the correspondence theorem is summarized by the diagram
below, where $c$ represents an arbitrary constant:

$$F + c \text{ on } R \iff \mu \text{ on } \mathcal{B}_I \iff \mu \text{ on } \mathcal{B}_0 \iff \mu \text{ on } \mathcal{B} \iff \mu \text{ on } \mathcal{B}_\mu.$$

1° $\mu$ on $\mathcal{B}_\mu$ $\Rightarrow$ $F + c$ on $R$. For, $\mu$ on $\mathcal{B}_\mu$ determines its restriction
to $\mathcal{B}_I$ and, from properties of L.S.-measures it follows that the
relation

$$F[a, b) = \mu[a, b)$$

determines an increment function with properties (i), (ii), and (iii)
given above.

2° $\mu$ on $\mathcal{B}_0$ $\Rightarrow$ $\mu$ on $\mathcal{B}_\mu$. For, $R$ being a denumerable sum of finite
intervals, the measure $\mu$ on $\mathcal{B}_0$ is $\sigma$-finite and the extension theorem
applies followed by completion.

3° $\mu$ on $\mathcal{B}_I$ $\Rightarrow$ $\mu$ on $\mathcal{B}_0$. It suffices to prove that if $A = \sum_k I_k \in \mathcal{B}_0$,
$I_k \in \mathcal{B}_I$, then $\mu(A)$ is determined by the $\sigma$-additivity requirement
$\mu(A) = \sum_k \mu(I_k)$; that is, if $A$ can also be written as $\sum_j I'_j$ where
$I'_j \in \mathcal{B}_I$, then $\sum_k \mu(I_k) = \sum_j \mu(I'_j)$. Since $\mu$ on $\mathcal{B}_I$ is additive and

$$I'_j = A I'_j = \sum_k I_k I'_j, \qquad I_k = A I_k = \sum_j I'_j I_k,$$

it follows that

$$\sum_j \mu(I'_j) = \sum_j \sum_k \mu(I_k I'_j) = \sum_k \sum_j \mu(I_k I'_j) = \sum_k \mu(I_k),$$

and the assertion is proved.

4° $F + c$ on $R$ $\Rightarrow$ $\mu$ on $\mathcal{B}_I$. We have to prove that the relation $\mu[a, b) =
F[a, b)$ determines a measure $\mu$ on $\mathcal{B}_I$, that is, if $I = \sum I_n$, where $I =
[a, b)$ and $I_n = [a_n, b_n)$, then $\mu I = \sum \mu I_n$. By interchanging, if
necessary, the subscripts, we can assume that, for every $n$,

$$a \le a_1 \le b_1 \le \cdots \le a_n \le b_n \le b.$$

It follows that

$$\sum_{k=1}^{n} \mu(I_k) = \sum_{k=1}^{n} F[a_k, b_k) \le \sum_{k=1}^{n} F[a_k, b_k) + \sum_{k=1}^{n-1} F[b_k, a_{k+1}) = F[a_1, b_n) \le F[a, b) = \mu(I),$$

and, letting $n \to \infty$, we get $\sum \mu(I_n) \le \mu(I)$.

It remains for us to prove the reverse inequality. We exclude the
trivial case $a = b$, select $\epsilon > 0$ such that $\epsilon < b - a$ and set
$I^\epsilon = [a, b - \epsilon]$. Because of the continuity from the left, for every $n$
there is an $\epsilon_n > 0$ such that $F[a_n - \epsilon_n, a_n) < \epsilon/2^n$. If
$I_n^\epsilon = (a_n - \epsilon_n, b_n)$, then, from $I^\epsilon \subset \bigcup I_n^\epsilon$ it follows, by the
Heine-Borel lemma, that there is an $n_0$ finite such that
$I^\epsilon \subset \bigcup_{k=1}^{n_0} I_k^\epsilon$. Let $k_1 \le n_0$ be such that $a \in I_{k_1}^\epsilon$
and, if $b_{k_1} \le b - \epsilon$, then let $k_2 \le n_0$ be such that $b_{k_1} \in I_{k_2}^\epsilon$. Continue in
this manner until some $b_{k_m} > b - \epsilon$; the process necessarily stops for
some $m \le n_0$. Omitting intervals that were not selected and, if
necessary, changing the subscripts, it follows that $I^\epsilon \subset \bigcup_{k=1}^{m} I_k^\epsilon$ and

$$a_1 - \epsilon_1 < a < b_1, \qquad a_{k+1} - \epsilon_{k+1} < b_k < b_{k+1} \ \text{ for } k = 1, 2, \ldots, m - 1, \qquad a_m - \epsilon_m < b - \epsilon < b_m.$$

Therefore,

$$F[a, b - \epsilon) \le F[a_1 - \epsilon_1, b_m) = F[a_1 - \epsilon_1, b_1) + \sum_{k=1}^{m-1} F[b_k, b_{k+1}) \le \sum_{k=1}^{m} F[a_k - \epsilon_k, b_k) \le \sum_{k=1}^{\infty} F[a_k, b_k) + \epsilon$$

and, letting $\epsilon \to 0$,

$$F[a, b) \le \sum F[a_n, b_n), \quad \text{that is,} \quad \mu(I) \le \sum \mu(I_n),$$

which completes the proof of the final assertion and, hence, of the
correspondence theorem.

Particular case. If $F$ is defined, up to an additive constant, by
$F(x) = x$, $x \in R$, then the corresponding measure of an interval is its
"length." The extension of "length" to a measure $\mu$ on $\mathcal{B}$ and the
completed measure $\mu$ on $\mathcal{B}_\mu$ are called Lebesgue measure on $\mathcal{B}$ or $\mathcal{B}_\mu$,
respectively, and $\mathcal{B}_\mu$ will be called Lebesgue field. The Lebesgue measure
is at the root of the general notion of measure.
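
The relation $\mu[a, b) = F[a, b)$ can be tried out numerically on finite sums of intervals. The following is only a minimal sketch: the names `ls_measure`, `identity`, `shifted`, and `jump_at_zero` are ours and purely illustrative. It computes the measure of a finite disjoint union of intervals $[a, b)$ from a d.f., and shows that two equivalent d.f.'s give the same values and that a left-continuous jump at 0 puts its mass in $[0, 1)$ and not in $[-1, 0)$.

```python
def ls_measure(F, intervals):
    """Measure of a finite disjoint union of intervals [a, b), given a d.f. F.

    Each interval is a pair (a, b) with a <= b; its measure is F(b) - F(a).
    """
    return sum(F(b) - F(a) for a, b in intervals)


def identity(x):
    # F(x) = x: the measure of an interval is its length.
    return x


def shifted(x):
    # An equivalent d.f.: it differs from `identity` by the constant 7.
    return x + 7.0


def jump_at_zero(x):
    # Left-continuous step function: F(0) = 0 and F(x) = 1 for x > 0.
    return 1.0 if x > 0 else 0.0


if __name__ == "__main__":
    pieces = [(0.0, 1.0), (2.0, 4.5)]
    print(ls_measure(identity, pieces))           # total length of the pieces
    print(ls_measure(shifted, pieces))            # same value: equivalent d.f.
    print(ls_measure(jump_at_zero, [(0.0, 1.0)]))  # the unit mass sits at the point 0
```

Working with increments only is what makes the additive constant irrelevant, in line with the correspondence being between measures and classes of equivalent d.f.'s.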

Remark. We can define a L.S.-measure on the Borel field $\bar{\mathcal{B}}$, the minimal
$\sigma$-field over the class of all intervals in $\bar{R} = [-\infty, +\infty]$, and, hence,
on $\bar{\mathcal{B}}_\mu$, by adjoining to a L.S.-measure on $\mathcal{B}$ arbitrary measures for the
sets $\{-\infty\}$ and $\{+\infty\}$.

Extension. The preceding definitions, proofs, and results remain
valid, word for word, if Borel lines are replaced by finite-dimensional
Borel spaces $R^N = R_1 \times \cdots \times R_N$, provided the following interpretation
of symbols is used: $a, b, x, \ldots$ are points in $R^N$, say, $a = (a_1, \ldots, a_N)$;
$a < b$ ($a \le b$) means that $a_k < b_k$ ($a_k \le b_k$) for $k = 1, \ldots, N$; $F$
on $R^N$ is a function with values $F(a) = F(a_1, \ldots, a_N)$ and increments
$F[a, b)$ are defined by

$$F[a, b) = \Delta_{b-a} F(a) = \Delta_{b_1 - a_1} \cdots \Delta_{b_N - a_N} F(a_1, \ldots, a_N)$$

where, for every $k$, $\Delta_{b_k - a_k}$ denotes the difference operator of step
$b_k - a_k$ acting on $a_k$. For instance, if $N = 2$,

$$\Delta_{b-a} F(a) = \Delta_{b_1 - a_1}\{F(a_1, b_2) - F(a_1, a_2)\} = F(b_1, b_2) - F(a_1, b_2) - F(b_1, a_2) + F(a_1, a_2)$$

and, in particular, if $F(a_1, a_2) = a_1 a_2$ is the area of the rectangle with
sides 0 to $a_1$ and 0 to $a_2$, then $\Delta_{b-a} F(a) = (b_1 - a_1)(b_2 - a_2)$ is the
area of the rectangle with sides $a_1$ to $b_1$ and $a_2$ to $b_2$.
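
Applying the $N$ difference operators in succession amounts to an alternating sum of the values of $F$ over the $2^N$ vertices of the interval $[a, b)$. The sketch below is only illustrative and the names `increment` and `area` are ours. It evaluates that alternating sum and recovers the area example above.

```python
from itertools import product


def increment(F, a, b):
    """Return F[a, b) = Delta_{b-a} F(a) for points a, b of R^N given as tuples."""
    n = len(a)
    total = 0.0
    # Every vertex takes, in each coordinate k, either the lower value a_k
    # or the upper value b_k; a vertex with j lower values has sign (-1)^j.
    for choice in product((0, 1), repeat=n):
        vertex = [a[k] if use_lower else b[k] for k, use_lower in enumerate(choice)]
        sign = -1 if sum(choice) % 2 else 1
        total += sign * F(*vertex)
    return total


def area(x1, x2):
    # F(a_1, a_2) = a_1 * a_2, the area of the rectangle with sides 0 to a_1, 0 to a_2.
    return x1 * x2


if __name__ == "__main__":
    # Rectangle with sides 1 to 4 and 2 to 6.
    print(increment(area, (1, 2), (4, 6)))
```

For $N = 2$ the four terms produced are exactly $F(b_1, b_2) - F(a_1, b_2) - F(b_1, a_2) + F(a_1, a_2)$.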

The defining properties of a d.f. $F$ on $R^N$ become:

$$-\infty < F < +\infty, \qquad F[a, b) = \Delta_{b-a} F(a) \ge 0, \qquad F[a, b) \to 0$$

as $a \uparrow b$, that is, $a_1 \uparrow b_1, \ldots, a_N \uparrow b_N$.

Product-d.f.'s and product-measures. A very important particular case
is that of product-d.f.'s:

$$F(x_1, \ldots, x_N) = \prod_{k=1}^{N} F_k(x_k), \qquad x_k \in R_k,$$

where the $F_k$ on $R_k$ are d.f.'s. Then $F$ on $R^N$ is a d.f., for,

$$\Delta_{b-a} F(a) = \prod_{k=1}^{N} \Delta_{b_k - a_k} F_k(a_k) \ge 0$$

and the other defining properties are clearly satisfied.

Every d.f. $F_k$ determines a measure $\mu_k$ on the Borel field in $R_k$, by
means of the relation $\mu_k[a_k, b_k) = F_k[a_k, b_k)$, and the measure $\mu$ on the
product Borel field determined by means of the relation $\mu[a, b) = F[a, b)$
is clearly the product-measure $\prod_{k=1}^{N} \mu_k$.

Let now $F_n$ be d.f.'s with $F_n(+\infty) - F_n(-\infty) = 1$, so that the measures
$\mu_n$ are probabilities. Then, by the product-probability theorem
or by the consistency theorem,

B. A sequence $F_n$ of d.f.'s corresponding to probabilities on $R_n$ determines
a product-probability on the Borel field in the product space $\prod R_n$.

This result extends at once to any set $\{F_t, t \in T\}$ of such d.f.'s.


COMPLEMENTS AND DETAILS 

In one guise or another, and especially when they are indefinite integrals,
signed measures on a fixed $\sigma$-field are in constant use in measure theory and
probability theory. Many of the properties established in this book are but
properties of such set functions.

Notation. The measurable sets belong to a fixed $\sigma$-field $\mathcal{A}$ on which the set
functions and limits of their sequences are defined. Unless otherwise stated
and with or without affixes, $A, B, \ldots$ denote sets, $\mu$ denotes a measure, $\varphi$
denotes a signed measure.

1. If $\varphi$ is $\sigma$-finite, then there are only countably many disjoint sets for which
$\varphi \ne 0$ in every class.

2. For every $A$ there exists a $B \subset A$ such that $\bar{\varphi}(A) \le 2|\varphi(B)|$.

3. If <pi ^ <p 2 9 then ^i + ^ ^ <P 2 ^- If <p = <pi dz <p 2 y then ^ ^ 

Pl 士 + 中 2 土 . 

4. Minimality of the Jordan-Hahn decomposition. If $\varphi = \mu^+ - \mu^-$, then
$\varphi^\pm \le \mu^\pm$.

We say that $A$ is a $\varphi$-null set, if $\varphi = 0$ on $\{AA';\ A' \in \mathcal{A}\}$. We say that $A$
and $B$ are $\varphi$-equivalent, if they coincide up to a $\varphi$-null set. We say that a
nonempty set $A$ is a $\varphi$-atom, if every measurable subset of $A$ is $\varphi$-equivalent either
to $\emptyset$ or to $A$.

5. The $\varphi$-null sets form a $\sigma$-ring; the null sets of $\varphi$ and of $\bar{\varphi}$ are the same.
The $\varphi$-equivalence is an equivalence relation (reflexive, transitive, and
symmetric), and $\mathcal{A}$ splits into $\varphi$-equivalence classes.

6. Every $\varphi$-null set and every measurable set consisting of one point is a
$\varphi$-atom; $\bar{\varphi}(A) = |\varphi(A)|$ for every $\varphi$-atom $A$. Atoms of $\bar{\varphi}$ and $\varphi$ are the same;
atoms of $\varphi$ are atoms of $\varphi^+$ and $\varphi^-$, but the converse is not necessarily true.
If $A$ is a $\varphi$-atom, then $\varphi = 0$ or $\varphi(A)$ on $A \cap \mathcal{A}$; if $\varphi$ is finite, then the converse
is true. What if $\varphi$ is $\sigma$-finite? What about $\varphi = \infty$ except for $\emptyset$?

7. If $\mu$ is finite, then $\Omega = \sum A_j + A$ where the $A_j$ or $A$ may be absent but,
if present, then the $A_j$ are $\mu$-atoms of positive measure and, for every $B \subset A$
of positive measure, $\mu$ takes every value $c$ between 0 and $\mu B$ for measurable
subsets of $B$. This decomposition of $\Omega$ is determined up to $\mu$-null sets. Can $\mu$ be
replaced by $\varphi$?

(There is only a countable number of $\mu$-equivalence classes of such $A_j$'s.
Select representatives $A_j$ of these classes and let $B \subset A = \Omega - \sum A_j$. Select
inductively sets $C_n \in \mathcal{C}_n$ such that $\mu C_n > \sup \mu C - \frac{1}{n}$ for all $C \in \mathcal{C}_n$, where
$\mathcal{C}_n$ is the class of all $C \subset B - (C_1 \cup C_2 \cup \cdots \cup C_{n-1})$ for which
$\mu C \le c - \mu(C_1 \cup C_2 \cup \cdots \cup C_{n-1})$. Then $\mu C = c$ for $C = \bigcup C_n$.)

8. If $\varphi$ is finitely additive, $\mu$ is finite, and $\mu A_n \to 0$ implies $\varphi A_n \to 0$, then
$\varphi$ is $\sigma$-additive.

We say that $\varphi$ is $\mu$-continuous if $\mu A = 0$ implies $\varphi A = 0$.

9. If $\mu A_n \to 0$ implies $\varphi A_n \to 0$ ($\bar{\varphi} A_n \to 0$), then $\varphi$ is $\mu$-continuous. If $\varphi$
is finite, then the converse is true.

(Assume the contrary of the converse; there exist $\epsilon > 0$ and $A_n$ such that
$\mu A_n < 1/2^n$ and $\bar{\varphi} A_n \ge \epsilon$. Then $\mu B = 0$ and $\bar{\varphi} B \ge \epsilon$ for $B = \limsup A_n$.)

What if <p is cr-finite? What about GL consisting of all subsets of a denumer¬ 
able space of points and ju{co n } = —, <p{(^n\ = n. What about fx replaced by 

10. If the are finite measures, then there exists a /x such that all the /x/ 

are ^-continuous. (Take ju = What about /x/’s replaced by <p/s ? 

Let $\mathcal{B} \subset \mathcal{A}$ be a $\sigma$-field such that the measurable subsets of elements of $\mathcal{B}$
belong to $\mathcal{B}$. Let $\mathcal{R}(\varphi)$ be the class of sets such that their subsets which belong
to $\mathcal{B}$ are $\varphi$-null. Call the sets of $\mathcal{B}$ "singular," and the sets of $\mathcal{R}(\varphi)$ "regular."
Call $\varphi$ regular (singular) if every singular (regular) set is $\varphi$-null.

Let $\varphi_r = \varphi_r^+ - \varphi_r^-$, $\varphi_s = \varphi_s^+ - \varphi_s^-$, defined by

$\varphi_r^\pm(A) = \sup \varphi^\pm(B)$ for all regular $B \subset A$,
$\varphi_s^\pm(A) = \sup \varphi^\pm(B)$ for all singular $B \subset A$.

11. Decomposition theorem. $\varphi_r$ is regular, $\varphi_s$ is singular, and $\varphi = \varphi_r + \varphi_s$.
If $\varphi$ is finite, then the decomposition of $\varphi$ into a regular and a singular part is
unique. What if $\varphi$ is $\sigma$-finite? What if $\mathcal{A}$ consists of all subsets of a noncountable
space, and $\varphi(A)$ equals the number of points of $A$? (Proceed as follows:

(i) $\mathcal{R}(\varphi) = \mathcal{R}(\bar{\varphi}) = \mathcal{R}(\varphi^+) \cap \mathcal{R}(\varphi^-)$ is a $\sigma$-field.

(ii) $\varphi_r$ ($\varphi_s$) is a regular (singular) signed measure.

(iii) Every $A$ contains disjoint $A_r$ regular and $A_s$ singular such that
$\varphi_r^\pm(A) = \varphi^\pm(A_r)$, $\varphi_s^\pm(A) = \varphi^\pm(A_s)$.

(iv) If $A = A'_r + A'_s$ with $A'_r$ regular and $A'_s$ singular, then we can take
$A_r = A'_r$ and $A_s = A'_s$.

(v) If $\varphi$ is finite, every $A$ can be so decomposed.)

12. We can take for singular sets:

(i) the $\mu$-null sets: regular (singular) becomes $\mu$-continuous
($\mu$-discontinuous);

(ii) the countable measurable sets: regular (singular) becomes continuous
(purely discontinuous);

(iii) the countable sums of atoms: regular (singular) becomes nonatomic
(atomic).

In each case investigate the regular and singular parts.

13. Intermediate-value theorem (compare with continuous function on a
connected set). If $A$ is nonatomic and $A_n \uparrow A$ with $\varphi A_n$ finite, then $\varphi$ takes
every value between $-\varphi^- A$ and $\varphi^+ A$ for measurable subsets in $A$. (See 7.)

What if $\mathcal{A}$ consists of all sets in a noncountable space, $\varphi A = 0$ or $\infty$ according
as $A$ is countable or not?


In what follows, the $\varphi_n$ are $\sigma$-additive but, unless otherwise stated, $\lim \varphi_n$
is not assumed to be $\sigma$-additive.

14. If $\varphi_n \to \varphi$ $\sigma$-additive, then $\varphi^\pm \le \liminf \varphi_n^\pm$. If, moreover, $\varphi_n \uparrow$ or
$\varphi_n \downarrow$, then $\varphi^\pm = \lim \varphi_n^\pm$.

15. If $\varphi_n \uparrow$ ($\downarrow$) and $\varphi_1 > -\infty$ ($< +\infty$), then $\varphi_n \to \varphi$ $\sigma$-additive.

16. If $\varphi_n \to \varphi$ uniformly on $\mathcal{A}$ and $\varphi > -\infty$ or $\varphi < +\infty$, then $\varphi$ is $\sigma$-additive.

17. To a measure space $(\Omega, \mathcal{A}, \mu)$ associate a complete metric space $(\mathcal{X}, d)$
as follows: $\mathcal{X}$ is the space of all sets $A, B, \ldots$ of finite measure, $d$ is a metric defined
by $d(A, B) = \mu(AB^c + A^c B)$. Prove that the metric space is complete.
(If $A_n$ is a mutually convergent sequence in $\mathcal{X}$, then the sequence $I_{A_n}$ mutually
converges in measure and hence converges in measure; see 6.3.)

If $\nu$ on $\mathcal{A}$ is a finite $\mu$-continuous measure, then $\nu$ is defined and continuous
on $(\mathcal{X}, d)$.

We say that the $\varphi_n$ are uniformly $\mu$-continuous if $\mu A_m \to 0$ implies $\varphi_n A_m \to 0$
uniformly in $n$, as $m \to \infty$.

18. Let $\mu$ be $\sigma$-finite. If the finite $\varphi_n$ are $\mu$-continuous and $\lim \varphi_n$ exists and is
finite, then the $\varphi_n$ are uniformly $\mu$-continuous and $\lim \varphi_n = \varphi$ is $\mu$-continuous

门 D 「/ C 9C; I <p m A — ip n A I 


and <r-additive. (For every € > 0, set Ah 




• By (/7), every Ak is closed. By Baire’s category theorem, there exists 

ko y do and 為 C 沉 such that [A^_ 9C; d{A y A^) < do] C Let 0 < 8o < do 
such that I ip n A | < e whenever \iA < 8q and n ^ ^o. If < 5o, then 
— A 、 Aq) < ^/o, U Aq) do 9 and | \ ^ | \ + 

I <Pn (為 U ^f) — <Pk o (^0 U y/) I + I <Pn(^0 — A) — (為一 /f) I.) 

19. If finite $\varphi_n \to \varphi$ finite, then $\varphi$ is $\sigma$-additive. (If $|\varphi_n| \le c_n$, set
$\mu = \sum \frac{1}{2^n c_n} \bar{\varphi}_n$ and apply 18.)

§ 5. MEASURABLE FUNCTIONS

5.1 Numbers. Spaces built with numbers are prototypes of all
spaces, and functions whose values are numbers are prototypes of all
functions.

By a number $x$ we mean either a usual real number (a finite number)
or one of the symbols $+\infty$ and $-\infty$ (the infinite numbers). These symbols
are defined by the following properties:

$$-\infty \le x \le +\infty,$$

$$\pm\infty = (\pm\infty) + x = x + (\pm\infty), \qquad \frac{x}{\pm\infty} = 0 \quad \text{if } -\infty < x < +\infty,$$

$$x(\pm\infty) = (\pm\infty)x = \begin{cases} \pm\infty & \text{if } 0 < x \le +\infty \\ 0 & \text{if } x = 0 \\ \mp\infty & \text{if } -\infty \le x < 0. \end{cases}$$

The expression $+\infty - \infty$ is meaningless, so that, when speaking of a
"sum" of two numbers, we assume that, if one of them is $\mp\infty$ the other
one is not $\pm\infty$; then the sum exists.
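
These conventions are close to, but not identical with, those of floating-point arithmetic: in Python, `0 * math.inf` evaluates to NaN, whereas here $0 \cdot (\pm\infty) = 0$. The following minimal sketch makes the difference concrete; the helper names `ext_mul` and `ext_add` are ours and are introduced for illustration only.

```python
import math

INF = math.inf


def ext_mul(x, y):
    """Product of two numbers with the convention 0 * (+-inf) = 0."""
    if x == 0 or y == 0:
        return 0.0
    return x * y


def ext_add(x, y):
    """Sum of two numbers; the expression (+inf) + (-inf) is rejected."""
    if math.isinf(x) and math.isinf(y) and x != y:
        raise ValueError("+inf - inf is meaningless")
    return x + y


if __name__ == "__main__":
    print(ext_mul(0, INF))     # convention of the text
    print(0 * INF)             # plain floating-point product
    print(ext_mul(-2, INF))    # sign rule for negative x
    print(5 / INF)             # x / (+-inf) = 0 for finite x
    print(ext_add(3, -INF))
```

Only the product by zero and the undefined sum need special handling; the remaining rules agree with ordinary floating-point behavior.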

The reason for the introduction of infinite numbers lies in the fact
that, then, $\sup x_t$ and $\inf x_t = -\sup(-x_t)$, where $t$ varies over an
arbitrary set $T$, always exist (but may be infinite). Moreover, if inclusion,
union, and intersection of numbers are defined by $x \le y$, $\sup x_t$ and
$\inf x_t$ respectively, then these operations have properties of the
corresponding set operations; in particular, limits of monotone sequences of
numbers always exist, but may be infinite.

If, as $n \to \infty$, the limit $x$ of a sequence $x_n$ of numbers exists, we write
$x = \lim x_n$ or $x_n \to x$ and say that $x_n$ converges to $x$; if $x$ is infinite, say,
$+\infty$, one also says that $x_n$ diverges to $+\infty$. The Cauchy mutual convergence
criterion is valid only for finite limits: $x_n$ converges to some finite $x$
if, and only if, $x_m - x_n \to 0$ (as $m, n \to \infty$) or, equivalently, if
$x_{n+\nu} - x_n \to 0$ uniformly in $\nu$ (as $n \to \infty$). On the other hand, the
Bolzano-Weierstrass lemma remains valid without the usual restriction of boundedness:
every sequence of numbers is compact, that is, contains a convergent
subsequence, but if the sequence is not bounded then the limits may be
infinite.

The set of all finite numbers is a real line $R = (-\infty, +\infty)$ and the
set of all numbers is an extended real line $\bar{R} = [-\infty, +\infty]$. The basic
class of sets in $R$ is the class of intervals; there are four types of finite
intervals of respective form:

$[a, b)$: set of all points $x$ such that $a \le x < b$
$(a, b]$: set of all points $x$ such that $a < x \le b$
$(a, b)$: set of all points $x$ such that $a < x < b$
$[a, b]$: set of all points $x$ such that $a \le x \le b$.

The minimal $\sigma$-field over the class of all intervals in $R$ is the Borel field
in $R$ and its elements are Borel sets in $R$. The Borel field in $R$ coincides
with the minimal $\sigma$-field over the subclass of all intervals of one of the
foregoing four types, since countable operations performed upon elements
of one of these subclasses yield any element of the other subclasses;
for example,

$$(a, b) = \bigcup_{n=1}^{\infty} \left[a + \frac{1}{n},\, b\right), \qquad [a, b] = \bigcap_{n=1}^{\infty} \left[a,\, b + \frac{1}{n}\right),$$

etc. Similarly, the Borel field in $R$ is the minimal $\sigma$-field over the subclass
of all infinite intervals of the form $(-\infty, x)$, $-\infty \le x \le +\infty$, since
any finite interval $[a, b)$ is obtainable as a difference $\Delta_{b-a}(-\infty, a) =
(-\infty, b) - (-\infty, a)$. The Borel field in $\bar{R}$ can be defined similarly by
means of any of the foregoing types where $-\infty \le a \le b \le +\infty$, or by
means of the intervals $[-\infty, x)$, $-\infty \le x \le +\infty$; but, frequently, the
most convenient way is to take the minimal $\sigma$-field over the class formed
by the Borel field in $R$ and the two sets $\{-\infty\}$, $\{+\infty\}$.

Extension. The preceding notions extend at once to finite-dimensional
real spaces. The set of all ordered $N$-uples $x = (x_1, \ldots, x_N)$ of finite
numbers is the $N$-dimensional real space $R^N$ or, equivalently, the product
space $\prod_{\nu=1}^{N} R_\nu$ of $N$ real lines $R_\nu = (-\infty < x_\nu < +\infty)$. If every $R_\nu$ is
replaced by $\bar{R}_\nu = [-\infty \le x_\nu \le +\infty]$, then we have the extended
$N$-dimensional real space $\bar{R}^N$. If $a, b \in R^N$, then $a \le b$ means that $a_\nu \le b_\nu$
for $\nu = 1, 2, \ldots, N$, and, similarly, for $a < b$, $a = b$.

An interval, say $[a, b)$, will also be written more explicitly as
$[a_1, a_2, \ldots, a_N;\ b_1, b_2, \ldots, b_N)$, and

$$[a, b) = \Delta_{b-a}(-\infty, a) = \Delta_{b_1 - a_1} \Delta_{b_2 - a_2} \cdots \Delta_{b_N - a_N}(-\infty, -\infty, \ldots, -\infty;\ a_1, a_2, \ldots, a_N)$$

where $\Delta_{b_\nu - a_\nu}$ is the difference operator of step $b_\nu - a_\nu$ acting on $a_\nu$.
For example, if $N = 2$, then

$$[a_1, a_2;\ b_1, b_2) = \Delta_{b_1 - a_1} \Delta_{b_2 - a_2}(-\infty, -\infty;\ a_1, a_2)$$
$$= \Delta_{b_1 - a_1}\{(-\infty, -\infty;\ a_1, b_2) - (-\infty, -\infty;\ a_1, a_2)\}$$
$$= (-\infty, -\infty;\ b_1, b_2) - (-\infty, -\infty;\ a_1, b_2) - (-\infty, -\infty;\ b_1, a_2) + (-\infty, -\infty;\ a_1, a_2).$$

With this interpretation, the foregoing definitions of types of intervals
and, thereafter, of Borel fields, remain the same.

5.2 Numerical functions. A numerical function $X$ on a space $\Omega$ is a
function on $\Omega$ to $\bar{R}$, defined by assigning to every point $\omega \in \Omega$ a single
number $x = X(\omega)$, the value of $X$ at $\omega$. If infinite values are excluded,
$X$ is a finite function or, equivalently, a function on $\Omega$ to $R$. $\Omega$ is called
the domain of $X$ and $R$ (or $\bar{R}$) is called the range space of $X$. The functions
$X^+ = X I_{[X \ge 0]}$ and $X^- = -X I_{[X < 0]}$ will be called the positive
part and the negative part of $X$, respectively, and we have

$$X = X^+ - X^-, \qquad |X| = X^+ + X^-.$$

Unless otherwise stated, all functions will be numerical functions and,
in general, will be denoted by $X, Y, \ldots$, with or without affixes.

If definitions or relations between values of given functions hold for
every $\omega$ belonging to a set $A \subset \Omega$ we say that these definitions or
relations hold on $A$ and drop "on $A$" if $A = \Omega$. For example,

$|X| < \infty$ means that $X$ is finite;

$X \ge 0$ on $A$ means that $X(\omega) \ge 0$ for every $\omega \in A$;

$X = \inf X_n$ means that $X(\omega) = \inf X_n(\omega)$ for every $\omega \in \Omega$;

$X_n \to X$ on $A$ means that $X_n(\omega) \to X(\omega)$ for every $\omega \in A$, etc.

Conversely, the set of all $\omega \in \Omega$ on which definitions or relations
hold is denoted by $[\omega; \cdots]$ or, if there is no confusion possible, by $[\cdots]$
where $\cdots$ stand for the definitions or relations. For example,

$[X]$ is the set on which $X$ is defined;

$[X \le Y]$ is the set of all $\omega \in \Omega$ for which $X(\omega) \le Y(\omega)$;

$[X \in S]$, where $S \subset \bar{R}$, is the set of all $\omega \in \Omega$ for which the values
$X(\omega)$ belong to the set $S$.

The set $[X = x]$ is called the inverse image of the set $\{x\}$ which consists
of $x$ only or, simply, of $x$. Since $X$ is single-valued, the inverse
images of distinct numbers $x$ are disjoint, and the partition of $\Omega$ into
inverse images of all $x \in \bar{R}$ is called the partition of the domain
induced by $X$; we sometimes write $X = \sum_{x \in \bar{R}} x I_{[X = x]}$ where $I_{[X = x]}$ is the
indicator of $[X = x]$. In particular, if $X$ is countably valued, that is, takes
only a countable number of values $x_j$, then, and only then,

$$X = \sum_j x_j I_{[X = x_j]}.$$

More generally, the set $[X \in S]$ is called inverse image of $S$ and is
also denoted by $X^{-1}(S)$. The symbol $X^{-1}$, which can be considered
as representing a mapping of sets in $\bar{R}$ onto sets in $\Omega$, is called the
inverse function of $X$. Since inverse images of disjoint sets of $\bar{R}$ are
disjoint, it follows easily that

$X^{-1}$ and set operations commute:

$$X^{-1}(S - S') = X^{-1}(S) - X^{-1}(S'), \qquad X^{-1}\left(\bigcup S_t\right) = \bigcup X^{-1}(S_t), \qquad X^{-1}\left(\bigcap S_t\right) = \bigcap X^{-1}(S_t).$$
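
On a finite domain these commutation relations can be checked by direct enumeration. The sketch below is illustrative only: the domain `omega`, the function `X`, and the sets `S`, `T` are hypothetical choices of ours. Note that `X` is deliberately not one-to-one, since the relations rely only on `X` being single-valued.

```python
def inverse_image(X, omega, S):
    """Return [X in S] = X^{-1}(S) as a subset of the finite domain omega."""
    return {w for w in omega if X(w) in S}


# Hypothetical finite domain and a single-valued function that is not one-to-one.
omega = set(range(-3, 4))


def X(w):
    return w * w


S = {0, 1, 4}
T = {4, 9}

if __name__ == "__main__":
    inv_S = inverse_image(X, omega, S)
    inv_T = inverse_image(X, omega, T)
    # Differences, unions, and intersections commute with X^{-1}.
    print(inverse_image(X, omega, S - T) == inv_S - inv_T)
    print(inverse_image(X, omega, S | T) == inv_S | inv_T)
    print(inverse_image(X, omega, S & T) == inv_S & inv_T)
```

The same relations fail in general for direct images, which is why measurability is phrased in terms of inverse images.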

Similarly, $X^{-1}(\mathcal{C})$ or the inverse image of $\mathcal{C}$, where $\mathcal{C}$ is a class of sets
in $\bar{R}$, is the class of all inverse images of elements of $\mathcal{C}$. Since set
operations commute with inverse functions, it follows that

a. The inverse image of a $\sigma$-field is a $\sigma$-field, the inverse image of the
minimal $\sigma$-field over a class is the minimal $\sigma$-field over the inverse image
of the class, the class of all sets whose inverse images belong to a $\sigma$-field is a
$\sigma$-field.

The foregoing definitions and properties extend at once to functions
$X = (X_1, \ldots, X_N)$ on $\Omega$ to an $N$-dimensional real space $R^N$ (or $\bar{R}^N$)
or, equivalently, to $N$-uples of numerical functions $X_1, \ldots, X_N$. Classical
analysis is concerned with functions from a real line to a real line
or, more generally, from a finite-dimensional real space $R^N$ to a
finite-dimensional real space $R^{N'}$. Still more generally, let $X$ be a function
on $\Omega$ to $R^N$ and let $g$ be a function on $R^N$ to $R^{N'}$. The function of function
$gX$ defined by $(gX)(\omega) = g(X(\omega))$ is a function on $\Omega$ to $R^{N'}$. Clearly,
its inverse function $(gX)^{-1}$ is a mapping of sets $S'$ in $R^{N'}$ onto sets in
$\Omega$ such that

$$(gX)^{-1}(S') = X^{-1}(g^{-1}(S'))$$

or, in a condensed form,

$$(gX)^{-1} = X^{-1} g^{-1}.$$

5.3 Measurable functions. Classical analysis is concerned primarily
with continuous functions on $R$ to $R$ or, more generally, on $R^N$ to
$R^{N'}$. However, passages to the limit, which play such a basic role in
analysis, do not, in general, preserve continuity (and also they cause
the appearance of $\pm\infty$). The essential achievement of modern analysis,
due to Borel, Baire, and Lebesgue, is the introduction of a wider
class of functions which is closed under the "usual" operations of analysis:
arithmetic operations and formation of infima, suprema, and limits
of sequences. Those are the functions we intend to define now.

In the domain $\Omega$ of our functions we select a $\sigma$-field $\mathcal{A}$ of sets, to be
called $\mathcal{A}$-sets or, if there is no confusion possible, measurable sets; the
doublet $(\Omega, \mathcal{A})$ is called a measurable space. In the range space $\bar{R}$ of
our functions we select the $\sigma$-field $\bar{\mathcal{B}}$ of Borel sets, the Borel field in $\bar{R}$;
the doublet $(\bar{R}, \bar{\mathcal{B}})$ is an (extended) Borel line. Thus, our functions are
defined on a measurable space $(\Omega, \mathcal{A})$ to the Borel line $(\bar{R}, \bar{\mathcal{B}})$. More
generally, if the range space is $\bar{R}^N$, then we select the Borel field $\bar{\mathcal{B}}^N$
and the doublet $(\bar{R}^N, \bar{\mathcal{B}}^N)$ is an extended Borel space; then the functions
are defined on a measurable space $(\Omega, \mathcal{A})$ to the Borel space $(\bar{R}^N, \bar{\mathcal{B}}^N)$.

A countably valued function $X = \sum_j x_j I_{A_j}$ where the sets $A_j$ are
measurable is called an elementary measurable function or, simply, an
elementary function; if the number of distinct values of $X$ is finite, then
$X$ is also called a simple function.

(C) Limits of convergent sequences of simple functions are called
measurable functions.

This is a constructive definition and, because of that, will play an
essential role in the constructive definition of integrals. However, general
properties of measurable functions are easier to discover and to
prove when using the descriptive definition which follows.

(D) Functions such that inverse images of all Borel sets are measurable
sets are called measurable functions.

Yet this definition is not the most economical one, since

(D') In (D), it suffices to require measurability of inverse images of
elements of any fixed class $\mathcal{C}$ such that the minimal $\sigma$-field over $\mathcal{C}$ is the
Borel field.

For example, we can take $\mathcal{C}$ to be the class of all intervals, or the class
of all intervals $[-\infty, x)$, etc.

The proof is immediate. Since a mapping $X^{-1}$ preserves all set
operations and the measurable sets form a $\sigma$-field, it follows that the
class of all sets whose inverse images are measurable is a $\sigma$-field. Therefore,
if, according to (D'), it contains $\mathcal{C}$, then it contains the minimal
$\sigma$-field over $\mathcal{C}$ which, by assumption, is the Borel field.

Similarly, the constructive definition (C) is not the most economical
one as we shall find in proving the basic theorem below.

A. Measurability theorem. The constructive and descriptive definitions
are equivalent, and the class of measurable functions is closed under
the usual operations of analysis.

Proof. 1° Let $X_n$ be functions measurable (D), that is, measurable
according to (D) or, equivalently, (D'). Then all sets

$$[\inf X_n < x] = \bigcup [X_n < x], \qquad [-X_n < x] = [X_n > -x]$$

are measurable and, hence, the functions

$$\sup X_n = -\inf(-X_n), \qquad \liminf X_n = \sup_n \left(\inf_{k \ge n} X_k\right), \qquad \limsup X_n = -\liminf(-X_n)$$

are measurable (D). Thus, the class of functions measurable (D) is
closed under formation of infima, suprema, and limits. But every simple
function $X = \sum_{j=1}^{m} x_j I_{A_j}$ is measurable (D), since all sets
$[X \le x] = \sum_{x_j \le x} A_j$ are measurable. Therefore, limits of convergent sequences of
simple functions are measurable (D); in particular, functions measurable
(C) are measurable (D).

2° Conversely, let $X$ be measurable (D) so that the functions

$$X_n = -n I_{[X < -n]} + \sum_{k = -n2^n + 1}^{n2^n} \frac{k - 1}{2^n} I_{\left[\frac{k-1}{2^n} \le X < \frac{k}{2^n}\right]} + n I_{[X \ge n]}, \qquad n = 1, 2, \ldots$$

are simple. Since

$$|X_n(\omega) - X(\omega)| < \frac{1}{2^n} \quad \text{for } |X(\omega)| < n$$

and

$$X_n(\omega) = \pm n \quad \text{for } X(\omega) = \pm\infty,$$

it follows that $X_n \to X$ and this, together with what precedes, completes
the proof of the equivalence of the two definitions of measurability.
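
The simple functions $X_n$ of 2° depend on $\omega$ only through the value $x = X(\omega)$, so they can be illustrated by a function of $x$ alone. The following is a minimal sketch (the name `simple_approx` is ours): it truncates at $\pm n$ and otherwise rounds down to the nearest multiple of $1/2^n$, which is exactly the value $(k-1)/2^n$ on $[(k-1)/2^n \le X < k/2^n]$.

```python
import math


def simple_approx(x, n):
    """Value of the simple function X_n at a point where X takes the value x."""
    if x >= n:
        return float(n)       # the term n I[X >= n], including x = +inf
    if x < -n:
        return float(-n)      # the term -n I[X < -n], including x = -inf
    # Here -n <= x < n, so x lies in exactly one interval [(k-1)/2^n, k/2^n)
    # and floor(x * 2^n) equals k - 1.
    return math.floor(x * 2**n) / 2**n


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, simple_approx(math.pi, n), simple_approx(-0.3, n))
    print(simple_approx(math.inf, 5), simple_approx(-math.inf, 5))
```

For a fixed finite $x$ the error is below $1/2^n$ as soon as $n > |x|$, and at $x = \pm\infty$ the values $\pm n$ diverge to $\pm\infty$, which is the pointwise convergence used in the proof.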

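We observe that if $X$ is nonnegative, then the foregoing functions
$X_n$ become

$$X_n = \sum_{k=1}^{n2^n} \frac{k - 1}{2^n} I_{\left[\frac{k-1}{2^n} \le X < \frac{k}{2^n}\right]} + n I_{[X \ge n]}$$

and we have $0 \le X_n \uparrow X$. Also, if

$$X'_n = (-\infty) I_{[X = -\infty]} + \sum_{k = -\infty}^{+\infty} \frac{k - 1}{n} I_{\left[\frac{k-1}{n} \le X < \frac{k}{n}\right]} + (+\infty) I_{[X = +\infty]},$$

then $|X'_n - X| < \frac{1}{n}$ on $[|X| < \infty]$ and $X'_n = X$ on $[|X| = \infty]$, so
that $X'_n \to X$ uniformly.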

3° It remains to prove closure under the arithmetic operations.
Using definition (C) and the fact that arithmetic operations commute
with passages to the limit by convergent sequences, it suffices to show
that the class of simple functions is closed under the arithmetic operations.
But much more is true, for if $g$ on $R^N$ is an arbitrary function
and $X_k = \sum_j x_{kj} I_{A_{kj}}$, $k = 1, \ldots, N$, are simple (elementary) functions,
then the function of functions

$$g(X_1, \ldots, X_N) = \sum_{j_1, \ldots, j_N} g(x_{1j_1}, \ldots, x_{Nj_N}) I_{A_{1j_1} \cdots A_{Nj_N}}$$

is simple (elementary). This completes the proof.

According to this proof we have new equivalent constructive definitions
of measurable functions that we state now.

(C') A nonnegative function is measurable if it is the limit of a
nondecreasing sequence of nonnegative simple functions. A function $X$ is
measurable if its positive and negative parts $X^+$ and $X^-$ are measurable.

(C'') A function is measurable if it is the limit of a uniformly convergent
sequence of elementary functions. In particular, every bounded measurable
function is limit of a uniformly convergent sequence of simple functions.

Definition (C') will play a central role in the theory of integration.
Closure under the arithmetic operations is a very particular case of

a. A Baire function of measurable functions is measurable.

Proof. Let us recall a (constructive) definition of Baire functions
(we consider only finite-dimensional Borel spaces). Baire functions are
elements of the smallest class closed under passages to the limit
containing all continuous functions. Therefore, since the class of measurable
functions is closed under passages to the limit, it suffices to prove
that

A continuous function of measurable functions is measurable.

Thus, let $g$ on $R^N$ be continuous; that is, for every point $(x_1, \ldots, x_N)
\in R^N$,

$$g(x'_1, \ldots, x'_N) \to g(x_1, \ldots, x_N) \quad \text{as } x'_1 \to x_1, \ldots, x'_N \to x_N.$$

Let $X_k$, $k = 1, 2, \ldots, N$, be measurable and let $X_{nk}$ be sequences of
simple functions such that $X_{nk} \to X_k$ for every $k$. We found (in 3°)
that the functions $g(X_{n1}, \ldots, X_{nN})$ (that we assumed tacitly to have
meaning) are measurable and hence, by continuity and closure under
passages to the limit, the function

$$g(X_1, \ldots, X_N) = \lim g(X_{n1}, \ldots, X_{nN})$$

is measurable. This completes the proof.

All the foregoing definitions and properties extend at once, and word
for word, to functions on a measurable space to any finite-dimensional
Borel space, provided we replace $\bar{R}$ by $\bar{R}^N$ and leave out the operations
of multiplication and division that we do not define (at least here) for
such functions. For example,

functions such that inverse images of Borel sets in their range space are
measurable sets in their domain are called measurable functions.

This extension is useful but, in fact, brings nothing new, for

b. A function $X = (X_1, \ldots, X_N)$ is measurable if, and only if, its
components $X_1, \ldots, X_N$ are measurable.

In other words such a function is merely an $N$-uple of numerical
measurable functions.

Proof. If $X = (X_1, \ldots, X_N)$ is measurable, then, for every $k \le N$,
the sets

$$[X_k \le x_k] = X_k^{-1}[-\infty, x_k] = X^{-1}[-\infty, \ldots, -\infty;\ +\infty, \ldots, +\infty, x_k, +\infty, \ldots, +\infty]$$

are measurable, so that $X_k$ is measurable.

Conversely, if all $X_k$ are measurable, then the sets

$$[X \le x] = [X_1 \le x_1, \ldots, X_N \le x_N] = \bigcap_{k=1}^{N} [X_k \le x_k]$$

are measurable, so that $X = (X_1, \ldots, X_N)$ is measurable.

We give another (descriptive) definition of Baire functions. With
this definition, it is customary to call these functions Borel functions.
A measurable function on a finite-dimensional Borel space to a
finite-dimensional Borel space is called a Borel function. In other words, $g$ on
$R^N$ to $R^{N'}$ is a Borel function if, and only if, the inverse images of Borel
sets $S'$ in $R^{N'}$ are Borel sets $S$ in $R^N$. The proof of a in this more general
case is then immediate and we have

a'. A Borel function of a measurable function is measurable.

For, if $X$ is a measurable function (not necessarily numerical) and $g$ is
a Borel function on the range space of $X$, then, for every Borel set $S'$
in the range space of $g$, the set $(gX)^{-1}(S') = X^{-1}(g^{-1}(S'))$ is measurable
and, hence, $gX$ is a measurable function.

§ 6. MEASURE AND CONVERGENCES 

6.1 Definitions and general properties. The notions of “measurable”
sets and “measurable” functions are two out of a triplet of notions,
due essentially to Lebesgue, the third being the notion of “measure”
which gave its name to the two others, and which we shall introduce
now.

A function $\varphi$ on a $\sigma$-field $\mathcal{A}$ is said to be $\sigma$-additive if, for every
countable disjoint class $\{A_j\} \subset \mathcal{A}$,

$$\varphi\Big(\sum A_j\Big) = \sum \varphi(A_j).$$

To avoid trivialities, it is assumed that at least one value of $\varphi$, say,
$\varphi(A_0)$, $A_0 \in \mathcal{A}$, is finite. Since

$$\varphi(A_0 + \emptyset) = \varphi(A_0) = \varphi(A_0) + \varphi(\emptyset),$$

this assumption is equivalent to $\varphi(\emptyset) = 0$. To avoid meaningless
expressions of the form $+\infty - \infty$, it is assumed that at least one of the
possible values $-\infty$ or $+\infty$ is excluded.

$\varphi$ is said to be finite if its values are finite, and it is said to be $\sigma$-finite
if the space in which $\mathcal{A}$ is defined can be partitioned into a countable
number of sets in $\mathcal{A}$ for which the values of $\varphi$ are finite.

A measure $\mu$ on a $\sigma$-field $\mathcal{A}$ is a nonnegative and $\sigma$-additive function.
In other words, $\mu$ is defined by the three following properties:

(i) $\mu\big(\sum A_j\big) = \sum \mu(A_j)$ for every countable disjoint class $\{A_j\} \subset \mathcal{A}$;

(ii) $\mu(A) \ge 0$ for every $A \in \mathcal{A}$;

(iii) $\mu(\emptyset) = 0$.

The value $\mu(A)$ of $\mu$ at $A$ is called the measure of $A$ and, if there is no
confusion possible, we drop the bracket following the symbol $\mu$.

A measure space $(\Omega, \mathcal{A}, \mu)$ is formed by the space $\Omega$, the $\sigma$-field $\mathcal{A}$ of
measurable sets in this space, and the measure $\mu$ defined on this $\sigma$-field.
Unless otherwise stated all sets under consideration will be measurable
sets in our measure space. A set of measure 0 is said to be a $\mu$-null set
or, if there is no confusion possible, a null set, and definitions or relations
valid outside a $\mu$-null set are said to be valid almost everywhere (a.e.).
The following properties of the measure $\mu$ are immediate:

a. $\mu$ is nondecreasing, and $\mu$ is bounded if the space $\Omega$ is of finite measure.
This follows from

$$\mu B = \mu A + \mu(B - A) \ge \mu A \quad \text{for } B \supset A.$$

b. $\mu$ is sub $\sigma$-additive: $\mu \bigcup A_j \le \sum \mu A_j$.

This follows from

$$\mu \bigcup A_j = \mu(A_1 + A_1^c A_2 + A_1^c A_2^c A_3 + \cdots) = \mu A_1 + \mu A_1^c A_2 + \cdots \le \mu A_1 + \mu A_2 + \cdots.$$

A. Sequences theorem. If $A_n \uparrow A$, then $\mu A_n \uparrow \mu A$ and, in general,

$$\liminf \mu A_n \ge \mu(\liminf A_n).$$

If $\mu$ is finite, then, moreover,

$A_n \downarrow A$ implies $\mu A_n \downarrow \mu A$, $\quad \limsup \mu A_n \le \mu(\limsup A_n)$,

$A_n \to A$ implies $\mu A_n \to \mu A$.

Proof. If $A_n \uparrow A$ then, by $\sigma$-additivity,

$$\mu A = \mu A_1 + \mu(A_2 - A_1) + \cdots = \lim \{\mu A_1 + \mu(A_2 - A_1) + \cdots + \mu(A_n - A_{n-1})\} = \lim \mu A_n.$$

If $A_n$ is an arbitrary sequence, then, since $B_n = \bigcap_{k \ge n} A_k \uparrow \liminf A_n$ and
$\mu A_n \ge \mu B_n$, it follows that

$$\liminf \mu A_n \ge \lim \mu B_n = \mu(\liminf A_n),$$

and the first assertion is proved.

Let now $\mu$ be finite and use the proved assertion. If $A_n \downarrow A$, then
$A_1 - A_n \uparrow A_1 - A$ and, hence,

$$\mu A_1 - \mu A_n = \mu(A_1 - A_n) \uparrow \mu(A_1 - A) = \mu A_1 - \mu A,$$

so that $\mu A_n \downarrow \mu A$. If $A_n$ is an arbitrary sequence, then

$$\mu\Omega - \limsup \mu A_n = \liminf \mu A_n^c \ge \mu(\liminf A_n^c) = \mu\Omega - \mu(\limsup A_n)$$

and, hence, $\limsup \mu A_n \le \mu(\limsup A_n)$. Finally, if $A_n \to A$, then the two
inequalities proved above yield $\mu A_n \to \mu A$, and the proof is complete.
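
A standard counterexample, not taken from the text, suggests why the finiteness of the measure cannot be dropped in the second half of the theorem. It assumes the counting measure on the positive integers (the measure of a set is its number of points), which is $\sigma$-additive and $\sigma$-finite but not finite:

```latex
\[
A_n = \{n,\ n+1,\ n+2,\ \dots\} \downarrow \emptyset,
\qquad
\mu A_n = +\infty \ \text{ for every } n,
\qquad
\mu\emptyset = 0,
\]
\[
\text{so that}\qquad
\limsup \mu A_n = +\infty \; > \; 0 = \mu(\limsup A_n).
\]
```

The increasing half of the theorem is unaffected in this example: the complements $A_n^c = \{1, \dots, n-1\}$ increase to the whole space and their measures $n - 1$ increase to $+\infty$, the measure of the space.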


The introduction of measures yields new types of convergence founded
upon the notion of measure and unknown in classical analysis. Before
we introduce them, we recall the classical types of convergence; unless
otherwise stated, we consider sequences $X_n$ of measurable functions on
a fixed measure space $(\Omega, \mathcal{A}, \mu)$ and limits taken as $n \to \infty$.

If $X_n$ converges to $X$ on $A$ according to a definition “c” of convergence,
we say that $X_n$ converges “c” on $A$ and write $X_n \xrightarrow{c} X$ on $A$.
The Cauchy convergence criterion leads to the corresponding notion of
mutual convergence: if $X_{n+\nu} - X_n$ converges “c” to 0 on $A$ uniformly
in $\nu$ (or $X_m - X_n$ converges “c” to 0 on $A$ as $m, n \to \infty$), we say that
$X_n$ mutually converges “c” on $A$ and write $X_{n+\nu} - X_n \xrightarrow{c} 0$ (or
$X_m - X_n \xrightarrow{c} 0$). In defining mutual convergence, we naturally must assume
that the differences exist, that is, meaningless expressions $+\infty - \infty$ do
not occur. We drop “on $A$” if $A = \Omega$, and drop “c” if the convergence
is ordinary pointwise convergence.

We recall that $X_n \to X$ on $A$ means that, for every $\omega \in A$ and every
$\epsilon > 0$, there is an integer $n_{\epsilon,\omega}$ such that, for $n \ge n_{\epsilon,\omega}$,

if $X(\omega)$ is finite, then $|X(\omega) - X_n(\omega)| < \epsilon$,

if $X(\omega) = -\infty$, then $X_n(\omega) < -\dfrac{1}{\epsilon}$,

if $X(\omega) = +\infty$, then $X_n(\omega) > +\dfrac{1}{\epsilon}$.

If $n_{\epsilon,\omega} = n_\epsilon$ is independent of $\omega \in A$, then the convergence is
uniform and, according to the preceding conventions, we write $X_n \xrightarrow{u} X$
on $A$. According to the closure property of measurable functions, if a
sequence of measurable functions $X_n \to X$, then $X$ is measurable.
According to the Cauchy criterion, if $X_n$ are finite, then

$X_n \to X$ finite if, and only if, $X_m - X_n \to 0$ or, equivalently,
$X_{n+\nu} - X_n \to 0$;

$X_n \xrightarrow{u} X$ finite if, and only if, $X_m - X_n \xrightarrow{u} 0$ or, equivalently,
$X_{n+\nu} - X_n \xrightarrow{u} 0$.

6.2 Convergence almost everywhere. A sequence $X_n$ is said to
converge a.e. to $X$, and we write $X_n \xrightarrow{\text{a.e.}} X$, if $X_n \to X$ outside a
null set; it mutually converges a.e., and we write $X_m - X_n \xrightarrow{\text{a.e.}} 0$ or
$X_{n+\nu} - X_n \xrightarrow{\text{a.e.}} 0$, if it mutually converges outside a null set. It
follows, by the Cauchy criterion and the fact that a countable union
of null sets is a null set, that

a. A sequence of a.e. finite functions converges a.e. to an a.e. finite
function if, and only if, the sequence mutually converges a.e.

Let $X_n \xrightarrow{\text{a.e.}} X$. Since $X_n$ are taken to be measurable, $X$ is a.e.
measurable, that is, $X$ is the a.e. limit of a sequence of simple functions.
Also, if $X'$ is such that $X_n \xrightarrow{\text{a.e.}} X'$, then $X = X'$ a.e., for $X$ can differ
from $X'$ only on the null set on which $X_n$ converges neither to $X$ nor
to $X'$. Thus, the limit of the sequence $X_n$ is a.e. determined and
a.e. measurable. Moreover, if every $X_n$ is modified arbitrarily on a
null set $N_n$, then the whole sequence is modified at most on the null
set $\bigcup N_n$ and, therefore, the so modified sequence still converges a.e.
to $X$.

These considerations lead to the introduction of the notion of “equivalent”
functions: $X$ and $X'$ are equivalent if $X = X'$ a.e. Since the notion
has the usual properties of an equivalence (it is reflexive, transitive,
and symmetric), it follows that the class of all functions on our
measure space splits into equivalence classes, and the discussion which
precedes can be summarized as follows.

b. Convergence a.e. is a type of convergence of equivalence classes to an
equivalence class.

In other words, as long as we are concerned with convergence a.e. of
sequences of functions, these functions as well as the limit functions are
to be considered as defined up to an equivalence. In particular, we
can replace an a.e. finite and a.e. measurable function by a finite and
measurable function, and conversely, without destroying convergence
a.e.

Let us investigate in more detail the set on which a given sequence
converges. To simplify, we restrict ourselves to the most important
case of finite measurable functions, the study of the general case being
similar. By definition of ordinary convergence, the set of convergence
$[X_n \to X]$ of finite $X_n$ to a finite measurable $X$ is the set of all points
$\omega \in \Omega$ at which, for every $\epsilon > 0$, $|X(\omega) - X_n(\omega)| < \epsilon$ for $n \ge n_{\epsilon,\omega}$
sufficiently large. Since, moreover, the requirement “for every $\epsilon > 0$”
is equivalent to “for every term of a sequence $\epsilon_k \downarrow 0$ as $k \to \infty$,” say,
the sequence $\frac{1}{k}$, we have

$$[X_n \to X] = \bigcap_{\epsilon > 0} \bigcup_n \bigcap_\nu \big[\,|X_{n+\nu} - X| < \epsilon\,\big] = \bigcap_k \bigcup_n \bigcap_\nu \Big[\,|X_{n+\nu} - X| < \frac{1}{k}\,\Big],$$

so that the set $[X_n \to X]$ is measurable. Similarly for the set of mutual
convergence, since the set

$$[X_{n+\nu} - X_n \to 0] = \bigcap_{\epsilon > 0} \bigcup_n \bigcap_\nu \big[\,|X_{n+\nu} - X_n| < \epsilon\,\big] = \bigcap_k \bigcup_n \bigcap_\nu \Big[\,|X_{n+\nu} - X_n| < \frac{1}{k}\,\Big]$$

is measurable. Thus

c. The sets of convergence (to a finite measurable function) and of mutual
convergence of a sequence of finite measurable functions are measurable.

In other words, to every sequence we can assign a “measure of convergence”
and, the sets of divergence $[X_n \nrightarrow X]$ and $[X_{n+\nu} - X_n \nrightarrow 0]$
being complements of those of convergence and, hence, measurable, to
every sequence we can assign a “measure of divergence.” In particular,
the definitions of a.e. convergence of a sequence $X_n$ mean that

$$\mu[X_n \nrightarrow X] = 0 \quad \text{or} \quad \mu[X_{n+\nu} - X_n \nrightarrow 0] = 0.$$

Upon applying repeatedly the sequences theorem to the above-defined
sets, we obtain the following

Convergence a.e. criterion. Let $X$, $X_n$ be finite measurable
functions.

$X_n \xrightarrow{\text{a.e.}} X$ if, and only if, for every $\epsilon > 0$,

$$\mu \bigcap_n \bigcup_\nu \big[\,|X_{n+\nu} - X| \ge \epsilon\,\big] = 0$$

and, if $\mu$ is finite, this criterion becomes

$$\mu \bigcup_\nu \big[\,|X_{n+\nu} - X| \ge \epsilon\,\big] \to 0.$$

$X_{n+\nu} - X_n \xrightarrow{\text{a.e.}} 0$ if, and only if, for every $\epsilon > 0$,

$$\mu \bigcap_n \bigcup_\nu \big[\,|X_{n+\nu} - X_n| \ge \epsilon\,\big] = 0$$

and, if $\mu$ is finite, this criterion becomes

$$\mu \bigcup_\nu \big[\,|X_{n+\nu} - X_n| \ge \epsilon\,\big] \to 0.$$

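6.3 Convergence in measure. A sequence $X_n$ of finite measurable
functions is said to converge in measure to a measurable function $X$
and we write $X_n \xrightarrow{\mu} X$ if, for every $\epsilon > 0$,

$$\mu\big[\,|X_n - X| \ge \epsilon\,\big] \to 0.$$

The limit function $X$ is then necessarily a.e. finite, since

$$\mu\big[\,|X| = \infty\,\big] = \mu\big[\,|X_n - X| = \infty\,\big] \le \mu\big[\,|X_n - X| \ge \epsilon\,\big] \to 0.$$

Similarly, $X_{n+\nu} - X_n \xrightarrow{\mu} 0$ if, for every $\epsilon > 0$,

$$\mu\big[\,|X_{n+\nu} - X_n| \ge \epsilon\,\big] \to 0 \quad \text{(uniformly in } \nu\text{)}.$$

All considerations about equivalence classes in the case of convergence
a.e. remain valid for convergence in measure. In particular, if $X_n \xrightarrow{\mu} X$
and $X_n \xrightarrow{\mu} X'$, then $X$ and $X'$ are equivalent, for

$$\mu\big[\,|X - X'| \ge \epsilon\,\big] \le \mu\Big[\,|X - X_n| \ge \frac{\epsilon}{2}\,\Big] + \mu\Big[\,|X_n - X'| \ge \frac{\epsilon}{2}\,\Big] \to 0$$

and, hence,

$$\mu[X \ne X'] = \mu \bigcup_k \Big[\,|X - X'| \ge \frac{1}{k}\,\Big] = 0.$$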







We compare now convergence in measure and convergence a.e.

A. Comparison of convergences theorem. Let $X_n$ be a sequence of
finite measurable functions.

If $X_n$ converges or mutually converges in measure, then there is a
subsequence $X_{n_k}$ which converges in measure and a.e. to the same limit
function. If $\mu$ is finite, then convergence a.e. to an a.e. finite function implies
convergence in measure to the same limit function.

Proof. The second assertion is an immediate consequence of the
a.e. convergence criterion, since $\mu$ finite and $X_n \xrightarrow{\text{a.e.}} X$ imply that,
for every $\epsilon > 0$,

$$\mu\big[\,|X_{n+1} - X| \ge \epsilon\,\big] \le \mu \bigcup_\nu \big[\,|X_{n+\nu} - X| \ge \epsilon\,\big] \to 0.$$

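As for the first assertion, let $X_{n+\nu} - X_n \xrightarrow{\mu} 0$. Then, for every
integer $k$ there is an integer $n(k)$ such that, for $n \ge n(k)$ and all $\nu$,

$$\mu\Big[\,|X_{n+\nu} - X_n| \ge \frac{1}{2^k}\,\Big] < \frac{1}{2^k}.$$

Let $n_1 = n(1)$, $n_2 = \max(n_1 + 1, n(2))$, $n_3 = \max(n_2 + 1, n(3))$, etc.,
so that $n_1 < n_2 < n_3 < \cdots \to \infty$. Let $X'_k = X_{n_k}$ and

$$A_k = \Big[\,|X'_{k+1} - X'_k| \ge \frac{1}{2^k}\,\Big], \qquad B_n = \bigcup_{k \ge n} A_k,$$

so that

$$\mu B_n \le \sum_{k \ge n} \mu A_k < \frac{1}{2^{n-1}}.$$

Thus, for a given $\epsilon > 0$, $n$ large enough so that $\frac{1}{2^{n-1}} < \epsilon$, and all $\nu$, we
have on $B_n^c$

$$|X'_{n+\nu} - X'_n| \le \sum_{k \ge n} |X'_{k+1} - X'_k| < \frac{1}{2^{n-1}} < \epsilon.$$

Therefore,

$$\mu \bigcap_m \bigcup_\nu \big[\,|X'_{m+\nu} - X'_m| \ge \epsilon\,\big] \le \mu \bigcup_\nu \big[\,|X'_{n+\nu} - X'_n| \ge \epsilon\,\big] \le \mu B_n < \frac{1}{2^{n-1}}$$

and, hence, by the convergence a.e. criterion, $X'_{n+\nu} - X'_n \xrightarrow{\text{a.e.}} 0$.
Thus, by 6.2a, there is a finite $X'$ such that $X'_n \xrightarrow{\text{a.e.}} X'$. Since on
$B_n^c$ we have $|X'_{n+\nu} - X'_n| < \epsilon$ for all $\nu$, it follows, upon letting $\nu \to \infty$,
that on $B_n^c$ we also have $|X' - X'_n| < \epsilon$ outside perhaps a null subset.
Therefore, upon taking complements,

$$\mu\big[\,|X' - X'_n| \ge \epsilon\,\big] \le \mu B_n < \frac{1}{2^{n-1}} \to 0,$$

so that $X'_n \xrightarrow{\mu} X'$. A similar argument shows that $X_n \xrightarrow{\mu} X$ implies
$X'_n \xrightarrow{\text{a.e.}} X$. This completes the proof.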

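Corollary. Convergence and mutual convergence in measure imply one
another.

Proof. If $X_n \xrightarrow{\mu} X$, then, for every $\epsilon > 0$ and all $\nu$,

$$\mu\big[\,|X_{n+\nu} - X_n| \ge \epsilon\,\big] \le \mu\Big[\,|X_{n+\nu} - X| \ge \frac{\epsilon}{2}\,\Big] + \mu\Big[\,|X - X_n| \ge \frac{\epsilon}{2}\,\Big] \to 0,$$

so that $X_{n+\nu} - X_n \xrightarrow{\mu} 0$. Conversely, if $X_{n+\nu} - X_n \xrightarrow{\mu} 0$, then, upon
taking the subsequence $X_{n_k}$ of the foregoing theorem, with limit $X$, we obtain, for
every $\epsilon > 0$, by letting $n, n_k \to \infty$,

$$\mu\big[\,|X - X_n| \ge \epsilon\,\big] \le \mu\Big[\,|X - X_{n_k}| \ge \frac{\epsilon}{2}\,\Big] + \mu\Big[\,|X_{n_k} - X_n| \ge \frac{\epsilon}{2}\,\Big] \to 0,$$

so that $X_n \xrightarrow{\mu} X$, and the corollary is proved.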
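
The gap between the two modes of convergence can be seen on the classical "sliding interval" sequence, which is not taken from the text. The sketch below assumes Lebesgue measure on $[0, 1)$ and takes $X_n$ to be the indicator of the dyadic interval $[j/2^k, (j+1)/2^k)$, where $n = 2^k + j$ with $0 \le j < 2^k$; the function names are illustrative only. The measure of $[\,|X_n| \ge \epsilon\,]$ tends to 0, yet every point is hit once in every dyadic block, while the subsequence $n_k = 2^k$ converges to 0 at every point other than 0.

```python
from fractions import Fraction


def block_and_offset(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - (1 << k)


def X(n, omega):
    """Indicator of the dyadic interval [j/2^k, (j+1)/2^k) evaluated at omega."""
    k, j = block_and_offset(n)
    return 1 if Fraction(j, 2**k) <= omega < Fraction(j + 1, 2**k) else 0


def measure_exceeds(n):
    """Lebesgue measure of [|X_n| >= eps] for any 0 < eps <= 1 (the interval length)."""
    k, _ = block_and_offset(n)
    return Fraction(1, 2**k)


omega = Fraction(1, 3)
# Indices n < 2**10 at which X_n(omega) = 1: one per dyadic block k = 0, ..., 9.
hits = [n for n in range(1, 2**10) if X(n, omega) == 1]
# Along n_k = 2**k the intervals [0, 2**-k) shrink to the single point 0.
subsequence = [X(2**k, omega) for k in range(2, 10)]
```

Since `hits` gains one index in every block, $X_n(\omega)$ has no limit at any $\omega$, although `measure_exceeds(n)` tends to 0; this is consistent with the theorem, which only promises an a.e. convergent subsequence.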

§ 7. INTEGRATION 

The concepts of $\sigma$-field, measure, and measurable function are born
from the efforts, made in the nineteenth and the beginning of the
twentieth centuries, to extend the concept of integration to wider and wider
classes of functions. The decisive extension was accomplished by
Lebesgue, after Borel opened the way. Lebesgue worked with the special
“Lebesgue” measure. Radon applied the same approach working with
Lebesgue-Stieltjes measures. Finally, Fréchet, still using Lebesgue's
approach, got rid of the restrictions on the measure space on which
the numerical functions to be integrated were defined.

Lebesgue had two equivalent definitions of the integral, a descriptive
one and a constructive one. We shall use a constructive definition
of the integral of which there are many variants, but the basic
ideas are always the same and, in general, the integral is first defined
for simple functions. Although infinite values are not excluded,
nevertheless, the expression $+\infty - \infty$, being meaningless, must be avoided.
Therefore, it behooves us to start with integrals of functions of constant
sign, say, nonnegative ones. Furthermore, the central property of the
integral, called “the monotone convergence theorem,” says that for a
nondecreasing sequence of nonnegative functions integration and
passage to the limit can be interchanged. Therefore, we give here the
approach aimed directly at this theorem, an approach which requires a
minimum of notions and of effort. The reader will recognize in the
central definition 2° below, a particular form of the monotone
convergence theorem.

7.1 Integrals. We consider a fixed measure space $(\Omega, \mathcal{A}, \mu)$; $A$, $B$,
$\dots$, and $X$, $Y$, $\dots$, with or without affixes, will denote measurable
sets and (numerical) measurable functions, respectively.

Definitions. 1° The integral on $\Omega$ of a nonnegative simple function
$X = \sum_{j=1}^{m} x_j I_{A_j}$ is defined by

$$\int_\Omega X \, d\mu = \sum_{j=1}^{m} x_j \mu A_j.$$

2° The integral on $\Omega$ of a nonnegative measurable function $X$ is
defined by

$$\int_\Omega X \, d\mu = \lim \int_\Omega X_n \, d\mu,$$

where $X_n$ is a nondecreasing sequence of nonnegative simple functions
which converges to $X$.

3° The integral on $\Omega$ of a measurable function $X$ is defined by

$$\int_\Omega X \, d\mu = \int_\Omega X^+ \, d\mu - \int_\Omega X^- \, d\mu,$$

where $X^+ = X I_{[X \ge 0]}$ and $X^- = -X I_{[X < 0]}$ are the positive and
negative parts of $X$ respectively, provided the defining difference exists,
that is, provided at least one of the terms of this difference is finite.

If $\int_\Omega X \, d\mu$ is finite, that is, if both of the terms of the difference are
finite, $X$ is said to be integrable on $\Omega$.

Finally, if $X$ is a.e. determined and measurable, that is, there exists
a measurable function $X'$ such that $X = X'$ outside a $\mu$-null set,
we set $\int X = \int X'$, provided the right-hand side exists.

Upon replacing, in the preceding definitions, $\Omega$ by a measurable set
$A$ (hence replacing, in 1°, every $A_j = \Omega A_j$ by $AA_j$), they become
definitions of the integral of $X$ on $A$, to be denoted by $\int_A X \, d\mu$. Since, for
$X = \sum x_j I_{A_j} \ge 0$, we have

$$\int_\Omega X I_A \, d\mu = \sum x_j \mu AA_j = \int_A X \, d\mu,$$

it follows immediately that

if $\int_\Omega X \, d\mu$ exists so does $\int_A X \, d\mu$, and $\int_A X \, d\mu = \int_\Omega X I_A \, d\mu$.

To simplify the writing, we drop $d\mu$ and $\Omega$ in the foregoing symbols,
unless confusion is possible; thus, the symbols $\int_\Omega X \, d\mu$ and $\int_A X \, d\mu$
will be replaced by $\int X$ and $\int_A X$, respectively.
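
Definitions 1° and 2° can be tried out on a toy example. The sketch below is not from the text: it assumes a hypothetical five-point measure space with point masses `mu`, computes the integral of a nonnegative simple function as $\sum x_j \mu A_j$ with $A_j = [X = x_j]$, and uses the usual dyadic truncations $X_n = \min(\lfloor 2^n X \rfloor / 2^n,\ n)$ as one possible nondecreasing sequence of simple functions converging to $X$. Exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction
from math import floor

# Hypothetical finite measure space: five points with the point masses below.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8),
      3: Fraction(1, 16), 4: Fraction(1, 16)}


def integral_simple(values):
    """Definition 1: sum of x_j * mu(A_j), where A_j = [X = x_j]."""
    total = Fraction(0)
    for x in set(values.values()):
        mass = sum(mu[w] for w in mu if values[w] == x)
        total += x * mass
    return total


def dyadic(X, n):
    """X_n = min(floor(2^n X) / 2^n, n): nonnegative, simple, nondecreasing in n."""
    return {w: min(Fraction(floor(2**n * X[w]), 2**n), Fraction(n)) for w in X}


X = {0: Fraction(1, 3), 1: Fraction(5, 7), 2: Fraction(2),
     3: Fraction(10, 3), 4: Fraction(0)}

# Definition 2: the integral of X is the limit of these nondecreasing integrals.
approximations = [integral_simple(dyadic(X, n)) for n in range(1, 30)]
exact = sum(X[w] * mu[w] for w in mu)
```

On a finite space every function is itself simple, so `exact` is also what definition 1° gives directly; the point of the sketch is only that the integrals of the approximating sequence increase to that value.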


Justification and additivity. We have to justify the three definitions
1°, 2°, 3°, that is, we have to show that the concepts as defined exist
and are uniquely determined. In the course of the justification we
shall have use for the elementary properties below; the first one is
called the additivity property of the operation of integration.

A. Elementary properties. Let $\int X$, $\int Y$, $\int X + \int Y$ exist.

I. Linearity:

$$\int (X + Y) = \int X + \int Y, \qquad \int_{A+B} X = \int_A X + \int_B X, \qquad \int cX = c \int X.$$

II. Order-preservation:

$$X \ge 0 \Rightarrow \int X \ge 0, \qquad X \ge Y \Rightarrow \int X \ge \int Y, \qquad X = Y \text{ a.e.} \Rightarrow \int X = \int Y.$$

III. Integrability:

$X$ integrable $\Leftrightarrow$ $|X|$ integrable $\Rightarrow$ $X$ a.e. finite;

$|X| \le Y$ integrable $\Rightarrow$ $X$ integrable;

$X$ and $Y$ integrable $\Rightarrow$ $X + Y$ integrable.

Assume that the additivity property is proved. Then the second of
properties I follows by replacing in the first one $X$ by $XI_A$ and $Y$ by
$XI_B$. The third one follows directly by successive use of the definitions.

The successive use of the definitions also proves directly the first
and third of properties II, and the second one follows by the additivity
property upon setting $X = Y + Z$, where $Z \ge 0$.

Similarly for properties III, except for $|X|$ integrable $\Rightarrow$ $X$ a.e.
finite. But, if $\mu A > 0$ where $A = [\,|X| = \infty\,]$, then, on account of II,
$\int |X| \ge \int |X| I_A \ge c\mu A$ whatever be $c > 0$. It follows, by letting
$c \to \infty$, that $\int |X| = \infty$, and the property is proved ab contrario.
Thus

For each of the successive definitions, the elementary properties hold as
soon as the additivity property is proved.

We use this fact repeatedly in proceeding to the successive
justifications of the definitions and to the proof of the additivity property.

1° Nonnegative simple functions. Since $X = \sum_{j=1}^{m} x_j I_{A_j}$ is
nonnegative, the defining sum in

$$\int X = \sum_{j=1}^{m} x_j \mu A_j$$

exists; it may be infinite. Its value is independent of the way in which
$X$ is written. For, if $X$ is written in some other form $\sum_{k=1}^{n} y_k I_{B_k}$, then
$x_j = y_k$ if $A_j B_k \ne \emptyset$ and, from $\sum_{j=1}^{m} A_j = \sum_{k=1}^{n} B_k = \Omega$, it follows that

$$\sum_{j=1}^{m} x_j \mu A_j = \sum_{j,k} x_j \mu A_j B_k = \sum_{j,k} y_k \mu A_j B_k = \sum_{k=1}^{n} y_k \mu B_k.$$

Thus, $\int X$ is unambiguously defined.

Let now $X = \sum_{j=1}^{m} x_j I_{A_j}$ and $Y = \sum_{k=1}^{n} y_k I_{B_k}$ be two nonnegative simple
functions, so that $X + Y = \sum_{j,k} (x_j + y_k) I_{A_j B_k}$. Proceeding as above,
we have

$$\int (X + Y) = \sum_{j,k} (x_j + y_k) \mu A_j B_k = \sum_{j,k} x_j \mu A_j B_k + \sum_{j,k} y_k \mu A_j B_k = \sum_{j=1}^{m} x_j \mu A_j + \sum_{k=1}^{n} y_k \mu B_k = \int X + \int Y,$$

and the additivity property is proved.

2° Nonnegative measurable functions. In definition 2°, the sequence
of simple functions $X_n \ge 0$ is nondecreasing, so that, by AII for
simple functions, the sequence $\int X_n$ is nondecreasing and, hence, has a
limit, finite or not. Moreover, for every nonnegative measurable
function $X$ there exists such a sequence $X_n \uparrow X$. Therefore, to justify the
definition, it suffices to show that the defining limit is independent of
the particular choice of the sequence $X_n$. In other words

If two nondecreasing sequences $X_n$ and $Y_n$ of nonnegative simple
functions have the same limit, then

$$\lim \int X_n = \lim \int Y_n.$$

Proof. It suffices to prove that $0 \le X_n \uparrow X$ and $\lim X_n \ge Y$, where
$Y$ is a nonnegative simple function, imply $\lim \int X_n \ge \int Y$. For, then,
it follows from the assumptions that, for every integer $p$,

$$\lim \int X_n \ge \int Y_p, \qquad \lim \int Y_n \ge \int X_p,$$

and the asserted equality is obtained by letting $p \to \infty$.

First, we prove the asserted inequality under the supplementary
restrictions

$$\mu\Omega < \infty, \qquad m = \min Y > 0, \qquad M = \max Y < \infty.$$

Let $\epsilon > 0$ be less than $m$. Since $\lim X_n \ge Y$, it follows that $A_n =
[X_n > Y - \epsilon] \uparrow \Omega$. But, on account of the validity of A for simple
functions and the finiteness of $\mu$ and $Y$, we have

$$\int X_n \ge \int X_n I_{A_n} \ge \int (Y - \epsilon) I_{A_n} = \int Y - \int Y I_{A_n^c} - \epsilon \mu A_n \ge \int Y - M \mu A_n^c - \epsilon \mu A_n$$

and, hence, by letting $n \to \infty$ and then $\epsilon \to 0$, the asserted inequality
follows. Now, we get rid of the supplementary restrictions.

If $\mu\Omega = \infty$, then

$$\int X_n \ge \int X_n I_{A_n} \ge \int (Y - \epsilon) I_{A_n} \ge (m - \epsilon) \mu A_n \to \infty,$$

and the asserted inequality is trivially true.

If $M = \infty$, then, the inequality being valid with $X_n$ and $Y I_{[Y < \infty]} +
c I_{[Y = \infty]}$ where $c$ is an arbitrary finite number, we have

$$\lim \int X_n \ge \int Y I_{[Y < \infty]} + c \mu[Y = \infty]$$

and, letting $c \to \infty$, the right-hand side becomes $\int Y$.

Finally, if $m = 0$, then, since the functions $X_n$ and $Y$ are nonnegative
and, by what precedes, the inequality is true for integrals on $[Y > 0]$,
we have

$$\lim \int X_n \ge \lim \int_{[Y > 0]} X_n \ge \int_{[Y > 0]} Y = \int Y.$$

This completes the proof and the definition of the integral of a
nonnegative measurable function is justified.

Since the additivity property was proved for nonnegative simple
functions $X_n$, $Y_n$, and $0 \le X_n \uparrow X$, $0 \le Y_n \uparrow Y$ imply $0 \le X_n + Y_n \uparrow
X + Y$, it follows, by letting $n \to \infty$ in

$$\int (X_n + Y_n) = \int X_n + \int Y_n,$$

that

$$\int (X + Y) = \int X + \int Y.$$

Thus, the additivity property remains valid for nonnegative measurable
functions.

3° Measurable functions. The decomposition $X = X^+ - X^-$ of a
measurable function into its positive and negative parts is unique, so
that $\int X = \int X^+ - \int X^-$ is unambiguously defined, provided $\int X^+$
or $\int X^-$ is finite.

Finally, if $X$ is determined and measurable outside a $\mu$-null set $N$,
then let $X'$ be any measurable function such that $X = X'$ on $N^c$. The
integral of $X$ is defined by setting $\int X = \int X'$, provided $\int X'$ exists.
By AII for nonnegative measurable functions, the integrals of such
functions which coincide on $N^c$ are equal. It follows, by definition
3°, that the same is true when the functions are not of constant sign.
Therefore, $\int X$ is unambiguously defined.

It remains to prove the additivity property.

Since we assume that not only $\int X$ and $\int Y$ exist but also that
$\int X + \int Y$ exists, that is, is not of the form $+\infty - \infty$, it follows that
(excluding the trivial case of the three integrals infinite of the same sign)
at least one of the functions, say $Y$, is integrable and, hence, by AIII,
is a.e. finite. Therefore, $X + Y$ is a.e. determined, and we do not
restrict the generality by taking determined $X$ and $Y$, and changing $Y$
to 0 on the $\mu$-null event on which it is infinite and $X + Y$ may be not
determined.

We decompose $\Omega$ into the six sets on each of which $X$, $Y$, and $X + Y$
are of constant sign ($\ge 0$ or $< 0$). Because of definition 3° and
property AI for nonnegative functions, it suffices to prove the additivity
property on each of these sets, say $A = [X \ge 0,\ Y < 0,\ X + Y \ge 0]$.
But, on account of definition 3° and the additivity property for
nonnegative functions $(X + Y)I_A$ and $-YI_A$, we have

$$\int_A X = \int_A (X + Y) + \int_A (-Y) = \int_A (X + Y) - \int_A Y$$

and, $\int_A Y$ being finite,

$$\int_A X + \int_A Y = \int_A (X + Y).$$

Similarly for the other sets, and the additivity property follows.
This completes the justification of the definitions and the proof of
the elementary properties.

7.2 Convergence theorems. The central convergence property is as
follows:

A. Monotone convergence theorem. If $0 \le X_n \uparrow X$, then $\int X_n \uparrow \int X$.

Proof. Choose nonnegative simple functions $X_{km} \uparrow X_k$ as $m \to \infty$.
The sequence $Y_n = \max_{k \le n} X_{kn}$ of nonnegative simple functions is
nondecreasing, and, for $k \le n$,

$$X_{kn} \le Y_n \le X_n, \qquad \int X_{kn} \le \int Y_n \le \int X_n.$$

It follows, by letting $n \to \infty$, that

$$X_k \le \lim Y_n \le X, \qquad \int X_k \le \int \lim Y_n \le \lim \int X_n$$

and, by letting $k \to \infty$, we obtain

$$X \le \lim Y_n \le X, \qquad \lim \int X_n \le \int \lim Y_n \le \lim \int X_n.$$

Thus $\lim Y_n = X$ and $\int X = \lim \int X_n$. The assertion is proved.

Corollary 1. The integral is $\sigma$-additive on the family of nonnegative
measurable functions.

This means that, if the $X_n$ are nonnegative, then $\int \sum X_n = \sum \int X_n$,
and follows by $0 \le \sum_{k=1}^{n} X_k \uparrow \sum X_n$.

Corollary 2. If $X$ is integrable, then $\int_A |X| \to 0$ as $\mu A \to 0$.

For, if $X_n = X$ or $n$ according as $|X| < n$ or $|X| \ge n$, then $|X_n| \uparrow
|X|$, so that, given $\epsilon > 0$, there exists an $n_0$ such that $\int |X| <
\int |X_{n_0}| + \frac{\epsilon}{2}$. It follows that, for $A$ with $\mu A < \epsilon / 2n_0$,

$$\int_A |X| = \int_A |X_{n_0}| + \int_A \big(|X| - |X_{n_0}|\big) < \frac{\epsilon}{2} + \int |X| - \int |X_{n_0}| < \epsilon.$$

The monotone convergence theorem extends as follows:

B. Fatou-Lebesgue theorem. Let $Y$ and $Z$ be integrable functions.
If $Y \le X_n$ or $X_n \le Z$, then

$$\int \liminf X_n \le \liminf \int X_n, \quad \text{resp.} \quad \limsup \int X_n \le \int \limsup X_n.$$

If $Y \le X_n \uparrow X$, or $Y \le X_n \le Z$ and $X_n \xrightarrow{\text{a.e.}} X$, then $\int X_n \to \int X$.

Proof. If the $X_n$ are nonnegative, then

$$X_n \ge Y_n = \inf_{k \ge n} X_k \uparrow \liminf X_n,$$

so that, by the monotone convergence theorem,

$$\liminf \int X_n \ge \lim \int Y_n = \int \liminf X_n.$$

The asserted inequalities follow, by the additivity property, upon
applying this result to the sequences $X_n - Y$ and $Z - X_n$ of nonnegative
measurable functions, and the asserted equalities are immediate
consequences.
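
A small worked example, not taken from the text, may help to see that the inequalities can be strict and that the integrable upper bound $Z$ matters. It assumes the counting measure on the positive integers, so that the integral of a nonnegative function is the sum of its values, and takes $X_n$ to be the indicator of the single point $n$:

```latex
\[
X_n = I_{\{n\}}, \qquad \int X_n = 1 \ \text{ for every } n, \qquad
X_n \to 0 \ \text{ everywhere}.
\]
\[
\text{With } Y = 0: \qquad
\int \liminf X_n = 0 \; < \; 1 = \liminf \int X_n .
\]
\[
\text{Any } Z \ge X_n \text{ for all } n \text{ satisfies } Z \ge 1 \text{ everywhere, so }
\int Z = +\infty, \ \text{ and indeed } \
\limsup \int X_n = 1 \; > \; 0 = \int \limsup X_n .
\]
```

Here the lower bound $Y = 0$ is integrable, so the first inequality of the theorem applies and is strict; no integrable $Z$ exists, and the second inequality fails.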

Clearly, if the assumptions of this theorem hold only a.e., the
conclusions continue to hold. In fact, the last assertion, frequently called
the dominated convergence theorem, extends as follows:

C. Dominated convergence theorem. If $|X_n| \le Y$ a.e. with $Y$
integrable and if $X_n \xrightarrow{\text{a.e.}} X$ or $X_n \xrightarrow{\mu} X$, then $\int X_n \to \int X$. In fact,

$$\int_A X_n - \int_A X \to 0 \ \text{ uniformly in } A \quad \text{or, equivalently,} \quad \int |X_n - X| \to 0.$$

Proof. Since

$$\Big| \int_A (X_n - X) \Big| \le \int |X_n - X| = \int (X_n - X)^+ + \int (X_n - X)^-,$$

it follows that the last two assertions are equivalent and imply the
first one. Thus, it suffices to prove that $\int |X_n - X| \to 0$. Set
$Y_n = |X_n - X|$ and observe that $Y_n \le 2Y$ a.e. and that the $\int Y_n$
remain the same when the $Y_n$ are modified on null events. Therefore,
it suffices to prove that, if $0 \le Y_n \le Z$ integrable and $Y_n \xrightarrow{\text{a.e.}} 0$ or
$Y_n \xrightarrow{\mu} 0$, then $\int Y_n \to 0$.

The case $Y_n \xrightarrow{\text{a.e.}} 0$ follows from the last assertion in B. It implies
the case $Y_n \xrightarrow{\mu} 0$, since, by selecting a subsequence $Y_{n'}$ $(\xrightarrow{\mu} 0)$ such
that $\int Y_{n'} \to \limsup \int Y_n$ and, within this subsequence, a sequence
$Y_{n''} \xrightarrow{\text{a.e.}} 0$, it follows that $\int Y_{n''} \to 0$ and $\limsup \int Y_n = 0$. Hence
$\int Y_n \to 0$, and the proof is complete.


Extension. In all the preceding convergence theorems the parameter
$n \to \infty$ can be replaced by a parameter $t \to t_0$ along an arbitrary set
$T \subset R$ of values, the reason for this being that $a_t \to a$ as $t \to t_0$ along $T$
is equivalent to $a_{t_n} \to a$ for every sequence $t_n$ in $T$ converging to $t_0$.

Applications I. We assume all functions $X_t$ to be integrable.

The dominated convergence theorem yields at once

1° If $|X_t| \le Y$ integrable and $X_t \to X_{t_0}$ as $t \to t_0$ $(t \in T)$, then

$$\int X_t \to \int X_{t_0}.$$

This proposition yields, by applying the definition of derivative,

2° If, on $T$, $\dfrac{dX_t}{dt}$ exists at $t_0$ and $\left| \dfrac{X_t - X_{t_0}}{t - t_0} \right| \le Y$ integrable, then

$$\Big( \frac{d}{dt} \int X_t \Big)_{t_0} = \int \Big( \frac{dX_t}{dt} \Big)_{t_0}.$$

In turn, this proposition yields

3° If, on a finite interval $[a, b]$, $\dfrac{dX_t}{dt}$ exists and $\left| \dfrac{dX_t}{dt} \right| \le Y$ integrable,
then, on $[a, b]$,

$$\frac{d}{dt} \int X_t = \int \frac{dX_t}{dt}.$$

This follows from

$$X_t - X_{t'} = (t - t') \Big( \frac{dX_t}{dt} \Big)_{t''},$$

where $t''$ lies between $t$ and $t'$. And in its turn, this proposition yields

4° If, on a finite interval $[a, b]$, $X_t$ is continuous and $|X_t| \le Y$
integrable then, for every $t \in [a, b]$,

$$\int_a^t \Big( \int X_s \Big) ds = \int \Big( \int_a^t X_s \, ds \Big).$$

Moreover, if the foregoing assumptions hold for every finite interval and
$\int_{-\infty}^{+\infty} |X_t| \, dt \le Z$ integrable, then

$$\int_{-\infty}^{+\infty} \Big( \int X_t \Big) dt = \int \Big( \int_{-\infty}^{+\infty} X_t \, dt \Big).$$

The integrals with respect to $t$ are Riemann integrals.

The first assertion follows from the fact that the derivative of a
Riemann integral $\int_a^t g(s) \, ds$ where $g$ is continuous is $g(t)$ which is bounded
on $[a, b]$, so that, upon applying 3° to the asserted equality, it follows
that derivatives of both sides are equal and, since both sides vanish
for $t = a$, the equality is proved. The second assertion follows by 1°
from the first one, by letting $a \to -\infty$ and $t \to +\infty$.

II. Integrals over the Borel line. Let $\mathcal{B}$ be the Borel field in $R =
(-\infty, +\infty)$ and let $\mu$ be a measure on $\mathcal{B}$ which assigns finite values to
finite intervals. Let $\mathcal{B}_\mu$ be the class of all sets which are unions of a
Borel set and a subset of a $\mu$-null Borel set. $\mathcal{B}_\mu$ is closed under
formation of complements and countable unions and, hence, is a $\sigma$-field. By
assigning to every set of $\mathcal{B}_\mu$ the measure of the Borel set from which it
differs by a subset of a $\mu$-null set, $\mu$ is extended to a $\sigma$-finite measure
on $\mathcal{B}_\mu$, that we continue to denote by $\mu$. $\mathcal{B}_\mu$ will be called a
Lebesgue-Stieltjes field in $R$ and $\mu$ on $\mathcal{B}_\mu$ will be called a Lebesgue-Stieltjes measure.
The relation

$$F(b) - F(a) = F[a, b) = \mu[a, b)$$

determines, up to an additive constant, a function $F$ on $R$ which is
clearly finite, nondecreasing, and continuous from the left, called a
distribution function corresponding to $\mu$. (It was proved that, conversely,
such a function determines a Lebesgue-Stieltjes measure $\mu$.)

Let $g$ be a $\mathcal{B}_\mu$-measurable function. If $g$ is integrable, the integral
$\int g \, d\mu$ is called a Lebesgue-Stieltjes integral. If $F$ is a distribution
function corresponding to $\mu$, this integral is also denoted by $\int g \, dF$, and the
integral $\int_{[a,b)} g \, d\mu$ is also denoted by $\int_a^b g \, dF$. If $F(x) = x$, $x \in R$,
the corresponding measure is called the Lebesgue measure; it assigns to
every interval its “length” and, thus, is a direct extension of the notion
of length. The corresponding $\sigma$-field, or Lebesgue field, is formed by
Lebesgue sets and the corresponding integrals, say $\int g \, dx$, $\int_a^b g \, dx$, are
called Lebesgue integrals. Lebesgue field, measure, and integral are
prototypes of general $\sigma$-fields, measures, and integrals. One may say
that the basic ideas and methods relative to measure spaces and
integrals belong to Lebesgue.

Let $g$ be continuous on $[a, b]$. The Lebesgue-Stieltjes integral $\int_a^b g \, dF$
becomes then a Riemann-Stieltjes integral and the Lebesgue integral $\int_a^b g \, dx$
becomes then a Riemann integral.

The proof is easy. We have to show that, $g$ being continuous on
$[a, b]$, $\int_a^b g \, dF$ is limit of Riemann-Stieltjes sums. This is possible
because a continuous function on a closed interval is bounded and is
the (uniform) limit of any sequence of step-functions

$$g_n = \sum_{k=1}^{k_n} g(x'_{nk}) I_{[x_{nk},\, x_{n,k+1})}, \qquad a = x_{n1} < \cdots < x_{n,k_n+1} = b, \qquad x_{nk} \le x'_{nk} < x_{n,k+1},$$

such that $\max_{k \le k_n} (x_{n,k+1} - x_{nk}) \to 0$. Therefore, by the dominated
convergence theorem or, more specifically, by the last assertion of the
Fatou-Lebesgue theorem,

$$\int_a^b g \, d\mu = \lim \int_a^b g_n \, d\mu = \lim \sum_k g(x'_{nk}) \, \mu[x_{nk}, x_{n,k+1}),$$

that is,

$$\int_a^b g \, dF = \lim \sum_k g(x'_{nk}) \, F[x_{nk}, x_{n,k+1}),$$

where the right-hand side sums are precisely the usual Riemann-Stieltjes
sums. Thus, in the case of $g$ continuous on $[a, b]$, the integral
$\int_a^b g \, dF$ can be defined directly in terms of $F$, or of measures assigned
to intervals only.
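
The convergence of Riemann-Stieltjes sums can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the text: it takes the hypothetical choices $g(x) = x$ and $F(x) = x^2$ on $[0, 1]$ (where $F$ is nondecreasing and continuous), uses equal cells with the left end point as $x'_{nk}$, and compares with the value $\int_0^1 x \, d(x^2) = \int_0^1 2x^2 \, dx = 2/3$.

```python
def riemann_stieltjes_sum(g, F, a, b, n):
    """Sum of g(x'_k) * (F(x_{k+1}) - F(x_k)) over n equal cells, with x'_k = x_k."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        left = a + k * h
        right = a + (k + 1) * h
        total += g(left) * (F(right) - F(left))
    return total


def g(x):
    return x


def F(x):
    return x * x


# Finer and finer partitions of [0, 1]; the sums approach 2/3.
sums = [riemann_stieltjes_sum(g, F, 0.0, 1.0, n) for n in (10, 100, 1000, 10000)]
errors = [abs(s - 2.0 / 3.0) for s in sums]
```

With left end points the error behaves like $1/(2n)$ for this particular pair, so each tenfold refinement of the partition gains roughly one decimal digit.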
However, when $g$ is continuous on $R$, its Lebesgue-Stieltjes integral
over $R$ and its improper Riemann-Stieltjes integral do not necessarily
coincide. In fact, the last integral is defined by

$$\int_{-\infty}^{+\infty} g \, dF = \lim_{a \to -\infty,\ b \to +\infty} \int_a^b g \, dF,$$

provided the limit exists and is finite. It may happen that at the same
time

$$\int |g| \, dF$$

is infinite so that $|g|$ not being Lebesgue-Stieltjes integrable, $g$ is not
Lebesgue-Stieltjes integrable. Such examples are familiar; one of the
most classical ones is that of the improper Riemann integral of $g(x) =
\sin x / x$. However, if $g$ is Lebesgue-Stieltjes integrable then, clearly,
both integrals coincide. Thus, the class of continuous functions whose
improper Riemann-Stieltjes integrals with respect to a distribution
function $F$ exist (and are finite) contains the class of continuous
functions which are Lebesgue-Stieltjes integrable with respect to $F$.
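
The behavior of $\sin x / x$ can be checked numerically. The sketch below is a rough illustration, not part of the text: it assumes $F(x) = x$ (Lebesgue measure), restricts to $[0, N\pi]$ since the function is even, and uses a hand-written composite Simpson rule; the helper names are illustrative. The signed integrals settle near $\pi/2$, while the integrals of $|g|$ dominate a multiple of the harmonic series and so grow without bound.

```python
import math


def simpson(f, a, b, m=200):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for i in range(1, m):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


def sinc(x):
    """g(x) = sin(x)/x, extended by continuity with g(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def integrals_up_to(N):
    """Return (integral of g, integral of |g|) over [0, N*pi]."""
    signed = absolute = 0.0
    for k in range(N):
        piece = simpson(sinc, k * math.pi, (k + 1) * math.pi)
        signed += piece
        absolute += abs(piece)  # g keeps a constant sign on [k*pi, (k+1)*pi]
    return signed, absolute


signed_200, absolute_200 = integrals_up_to(200)
# On [k*pi, (k+1)*pi]: |g| >= |sin x| / ((k+1)*pi), whose integral is 2/((k+1)*pi).
harmonic_lower_bound = (2 / math.pi) * sum(1 / (k + 1) for k in range(200))
```

The lower bound used in the last line is the elementary estimate behind the divergence of $\int |g| \, dx$: the partial integrals of $|g|$ are at least $\frac{2}{\pi}\sum_{k \le N} \frac{1}{k}$.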


§ 8. INDEFINITE INTEGRALS; ITERATED INTEGRALS

8.1 Indefinite integrals and Lebesgue decomposition. We
characterize now the indefinite integrals by using repeatedly the monotone
convergence theorem. Let $X$ be a measurable function whose integral
exists, say, $\int X^-$ is finite. Then the indefinite integral $\varphi$ on $\mathcal{A}$
defined by

$$\varphi(A) = \int_A X, \qquad A \in \mathcal{A},$$

exists, for $\int X^- I_A$ is finite and $\int X^+ I_A$ exists. Since the integral of a
function which vanishes a.e. is 0, the indefinite integral is $\mu$-continuous,
that is, vanishes for $\mu$-null sets. Since for a countable measurable
partition $\{A_j\}$, $X^{\pm} I_A = \sum_j X^{\pm} I_{AA_j}$, it follows that, by the monotone
convergence theorem,

$$\int_A X^{\pm} = \sum_j \int_{AA_j} X^{\pm},$$

and the indefinite integral is $\sigma$-additive.








If $X$ is integrable, then it is a.e. finite, and the indefinite integral is finite. If $X$ is not integrable but, still, $X$ is a.e. finite and $\mu$ is $\sigma$-finite, then the indefinite integral is $\sigma$-finite. For, by decomposing $\Omega$ into sets $A_n$ of finite measure, we have

$$\int X = \sum_{m=-\infty}^{+\infty} \sum_{n=1}^{\infty} \int_{A_n[m \le X < m+1]} X$$

and every term of the double sum is finite.

The problem which arises is whether the foregoing properties characterize indefinite integrals, and the answer lies in the celebrated Lebesgue(-Radon-Nikodym) decomposition theorem that we shall establish now. But first we introduce a notion in opposition to that of $\mu$-continuity. A set function $\varphi_s$ on $\mathcal{A}$ is said to be $\mu$-singular if it vanishes outside a $\mu$-null set; in symbols, there is a $\mu$-null set $N$ such that

$$\varphi_s(AN^c) = 0, \qquad A \in \mathcal{A}.$$

A. Lebesgue decomposition theorem. If, on $\mathcal{A}$, the measure $\mu$ and the $\sigma$-additive function $\varphi$ are $\sigma$-finite, then there exists one, and only one, decomposition of $\varphi$ into a $\mu$-continuous and $\sigma$-additive set function $\varphi_c$ and a $\mu$-singular and $\sigma$-additive set function $\varphi_s$,

$$\varphi = \varphi_c + \varphi_s,$$

and $\varphi_c$ is the indefinite integral of a finite measurable function $X$ determined up to a $\mu$-equivalence.

$\varphi_c$ and $\varphi_s$ are called the $\mu$-continuous and $\mu$-singular parts of $\varphi$, and $X$ is called the derivative $d\varphi_c/d\mu$ with respect to $\mu$; we emphasize that $d\varphi_c/d\mu$ is determined up to $\mu$-equivalence.

Proof. 1° Since $\Omega$ is a countable sum of sets for which $\mu$ and $\varphi$ are finite and since, by the Hahn decomposition theorem, $\varphi$ is a difference of two measures, it suffices to prove the theorem for finite measures $\mu$ and $\varphi$. Furthermore, if there are two decompositions of $\varphi$ into a $\mu$-continuous and a $\mu$-singular part:

$$\varphi = \varphi_c + \varphi_s = \varphi'_c + \varphi'_s,$$

then

$$\varphi_c - \varphi'_c = \varphi'_s - \varphi_s = 0,$$

for the $\mu$-continuous function $\varphi_c - \varphi'_c$ vanishes for all $\mu$-null sets while the $\mu$-singular function $\varphi'_s - \varphi_s$ vanishes outside a $\mu$-null set. Finally, an indefinite integral determines the integrand up to an equivalence: if, for every $A \in \mathcal{A}$,

$$\varphi_c(A) = \int_A X = \int_A X',$$

then $X = X'$ a.e.; for, if, say, $\mu A = \mu[X - X' > \epsilon] > 0$, then

$$\int_A (X - X') > 0.$$

Thus the uniqueness assertions hold, and it remains to prove the existence assertions under the assumption that $\mu$ and $\varphi$ are finite measures.

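2° Let $\Phi$ be the class of all nonnegative integrable functions $X$ whose indefinite integrals are majorized by $\varphi$:

$$\int_A X \le \varphi(A), \qquad A \in \mathcal{A}.$$

$\Phi$ is not empty, since $X = 0$ belongs to it; and there is a sequence $\{X_n\} \subset \Phi$ such that

$$\int X_n \to \sup_{X \in \Phi} \int X = \alpha \le \varphi(\Omega) < \infty.$$

Let $X'_n = \sup_{k \le n} X_k$, so that $0 \le X'_n \uparrow X = \sup_n X_n$. Set

$$A_k = [X_k = X'_n], \qquad A'_k = A_1^c \cdots A_{k-1}^c A_k, \qquad A'_1 = A_1,$$

so that

$$\Omega = \bigcup_{k=1}^{n} A_k = \sum_{k=1}^{n} A'_k$$

and, for every $A \in \mathcal{A}$,

$$\int_A X'_n = \sum_{k=1}^{n} \int_{AA'_k} X'_n = \sum_{k=1}^{n} \int_{AA'_k} X_k \le \sum_{k=1}^{n} \varphi(AA'_k) = \varphi(A).$$

Upon letting $n \to \infty$ and applying the monotone convergence theorem, we get

$$\int_A X \le \varphi(A), \qquad \int X = \alpha.$$

Therefore $X$ is a "maximal" element of $\Phi$. This property will allow us to show that

$$\varphi_s = \varphi - \varphi_c \ge 0,$$

where $\varphi_c$ is the indefinite integral of $X$, is $\mu$-singular, and the proof will be complete.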
3° Let $D_n + D_n^c$ be a Hahn decomposition for the finite and $\sigma$-additive set function $\varphi_n = \varphi_s - \frac{1}{n}\mu$, that is, $\varphi_n(AD_n) \le 0$ and $\varphi_n(AD_n^c) \ge 0$ for every $A$. Let $D = \bigcap D_n$ (whence $D^c = \bigcup D_n^c$), so that, for every $A$ and all $n$,

$$0 \le \varphi_s(AD) \le \frac{1}{n}\mu(AD).$$

Upon letting $n \to \infty$, it follows that $\varphi_s(AD) = 0$ and, hence, $\varphi_s(A) = \varphi_s(AD^c)$. Since

$$\varphi_c(A) = \varphi(A) - \varphi_s(AD^c) \le \varphi(A) - \varphi_s(AD_n^c),$$

it follows that

$$\int_A \Bigl(X + \frac{1}{n} I_{D_n^c}\Bigr) = \varphi_c(A) + \frac{1}{n}\mu(AD_n^c) \le \varphi(A) - \varphi_n(AD_n^c) \le \varphi(A),$$

so that $X + \frac{1}{n} I_{D_n^c} \in \Phi$. But this conclusion is contradicted by

$$\int \Bigl(X + \frac{1}{n} I_{D_n^c}\Bigr) = \alpha + \frac{1}{n}\mu D_n^c > \alpha$$

unless $\mu D_n^c = 0$. Therefore, all sets $D_n^c$ are $\mu$-null sets and so is their countable union $D^c$. Since $\varphi_s(A) = \varphi_s(AD^c)$, it follows that $\varphi_s$ is $\mu$-singular, and the proof is complete.
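On a finite space the decomposition can be written down explicitly, which may help fix ideas. The sketch below assumes a finite set of points with nonnegative masses for both $\mu$ and $\varphi$; the function name `lebesgue_decomposition` and the sample masses are hypothetical, and the value 0 given to the density on the $\mu$-null set is an arbitrary choice (the density is only determined up to $\mu$-equivalence).

```python
from fractions import Fraction


def lebesgue_decomposition(mu, phi):
    """Split phi into mu-continuous and mu-singular parts on a finite set.

    mu and phi map each point to a nonnegative mass (same keys).
    Returns (density, phi_c, phi_s) as dicts keyed by point.
    """
    density, phi_c, phi_s = {}, {}, {}
    for point, mass in mu.items():
        if mass > 0:
            density[point] = Fraction(phi[point]) / Fraction(mass)
            phi_c[point] = Fraction(phi[point])
            phi_s[point] = Fraction(0)
        else:
            # On the mu-null set the density is arbitrary; 0 is chosen here.
            density[point] = Fraction(0)
            phi_c[point] = Fraction(0)
            phi_s[point] = Fraction(phi[point])
    return density, phi_c, phi_s


mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
phi = {"a": Fraction(1), "b": Fraction(0), "c": Fraction(3)}
density, phi_c, phi_s = lebesgue_decomposition(mu, phi)
```

Here the singular part lives on the single $\mu$-null point `"c"`, and the continuous part is the indefinite integral of `density` with respect to `mu`.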

In the particular case of a $\mu$-continuous $\varphi$, the foregoing theorem reduces to

B. Radon-Nikodym theorem. If, on $\mathcal{A}$, the measure $\mu$ and the $\sigma$-additive set function $\varphi$ are $\sigma$-finite and $\varphi$ is $\mu$-continuous, then $\varphi$ is the indefinite integral of a finite function determined up to an equivalence.

We are now in a position to characterize indefinite integrals of finite functions on a $\sigma$-finite measure space.

C. A set function $\varphi$ on $\mathcal{A}$ is the indefinite integral on a $\sigma$-finite measure space of a finite function $X$ determined up to an equivalence, if, and only if, $\varphi$ is $\sigma$-finite, $\sigma$-additive, and $\mu$-continuous; and $X$ is integrable if, and only if, this $\varphi$ is finite.

The "if" assertion is the Radon-Nikodym theorem and the "only if" assertion is contained in the discussion at the beginning of this subsection.
Corollary. Let $\lambda$ and $\mu$ be $\sigma$-finite measures on $\mathcal{A}$. If $\mu$ is $\lambda$-continuous and $X$ is a measurable function whose integral $\int X\, d\mu$ exists, then, for every $A \in \mathcal{A}$,

$$\int_A X\, d\mu = \int_A X \frac{d\mu}{d\lambda}\, d\lambda.$$

Proof. If $X = I_B$, $B \in \mathcal{A}$, then the equality is valid, since

$$\int_A I_B\, d\mu = \mu AB = \int_{AB} \frac{d\mu}{d\lambda}\, d\lambda = \int_A I_B \frac{d\mu}{d\lambda}\, d\lambda.$$

It follows that the equality is valid for nonnegative simple functions and hence, by the monotone convergence theorem, for nonnegative measurable functions and, consequently, for measurable functions whose integral exists.
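The corollary's change-of-measure formula can be checked by hand on a finite space. The sketch below is an illustration only: the three-point space, the masses of `lam` and `mu`, the function `X`, and the helper name `integral` are all assumptions chosen so that `mu` is `lam`-continuous.

```python
from fractions import Fraction


def integral(X, measure, A):
    """Integral of X over the set A with respect to a finite-space measure."""
    return sum((X[w] * measure[w] for w in A), Fraction(0))


lam = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2)}
mu = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 2)}

# lam charges every point, so mu is lam-continuous and the derivative is a ratio.
dmu_dlam = {w: mu[w] / lam[w] for w in lam}

X = {1: Fraction(3), 2: Fraction(-7), 3: Fraction(5)}
A = {1, 2}

lhs = integral(X, mu, A)
rhs = integral({w: X[w] * dmu_dlam[w] for w in lam}, lam, A)
```

Both sides equal 3/2 for this choice of data; the point 2 contributes nothing on either side because `mu` vanishes there.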

Extension. The indefinite integral of a measurable function $X$ which is not necessarily finite is still $\sigma$-additive and $\mu$-continuous, but it is not necessarily $\sigma$-finite. The question arises whether the Radon-Nikodym theorem can be extended to this case. The answer is in the affirmative.

D. The Radon-Nikodym theorem remains valid if finiteness of $X$ and $\sigma$-finiteness of $\varphi$ are simultaneously suppressed therein.

Proof. As usual, it suffices to consider a finite measure $\mu$ and a $\mu$-continuous measure $\varphi$ on $\mathcal{A}$.

Let $\mathcal{B}$ be the class of all measurable sets $B$ such that $\varphi$ on $\{BA, A \in \mathcal{A}\}$ is $\sigma$-finite, and let $s$ be the supremum of $\mu$ on $\mathcal{B}$.

There exists a sequence $B_n \in \mathcal{B}$ such that $s = \lim \mu B_n$ and, hence, $B = \bigcup B_n \in \mathcal{B}$ with $\mu B = s$. If there exists a $C \in \{B^c A, A \in \mathcal{A}\}$ such that $0 < \varphi(C) < \infty$, then $B + C \in \mathcal{B}$, $\mu C > 0$, and

$$s \ge \mu(B + C) = \mu B + \mu C > s.$$

Therefore, while $\varphi$ on $\{BA, A \in \mathcal{A}\}$ is $\sigma$-finite, $\varphi$ on $\{B^c A, A \in \mathcal{A}\}$ can take values 0 and $\infty$ only.

Furthermore, whatever be $C \in \{B^c A, A \in \mathcal{A}\}$, it is impossible to have $\mu C > 0$ and $\varphi(C) = 0$ since then $B + C \in \mathcal{B}$ and, as above, $s > s$. Since $\varphi$ is $\mu$-continuous, it is also impossible to have $\mu C = 0$ and $\varphi(C) > 0$. Thus, for every $C \in \{B^c A, A \in \mathcal{A}\}$ either $\mu C > 0$ and $\varphi(C) = \infty \cdot \mu C = \infty$ or $\mu C = 0$ and $\varphi(C) = 0$. In other words, $\varphi$ on $\{B^c A, A \in \mathcal{A}\}$ is the indefinite integral of a function $X = \infty$ on $B^c$, determined up to an equivalence. On the other hand, by B, $\varphi$ on $\{BA, A \in \mathcal{A}\}$ is the indefinite integral of a function $X$ on $B$, determined up to an equivalence. These values of $X$ on $B$ and on $B^c$ determine it on $\Omega$, up to an equivalence and, for every $A \in \mathcal{A}$,

$$\int_A X = \int_{AB} X + \int_{AB^c} X = \varphi(AB) + \varphi(AB^c) = \varphi(A).$$

The extension follows.

8.2 Product measures and iterated integrals. Let $(\Omega_i, \mathcal{A}_i, \mu_i)$, $i = 1, 2$, be two measure spaces. A space $(\Omega, \mathcal{A}, \mu)$ is their product-measure space if

$\Omega = \Omega_1 \times \Omega_2$ is the space of all points $\omega = (\omega_1, \omega_2)$, $\omega_i \in \Omega_i$;

$\mathcal{A} = \mathcal{A}_1 \times \mathcal{A}_2$ is the minimal $\sigma$-field over the class of all measurable "rectangles" $A_1 \times A_2$, $A_i \in \mathcal{A}_i$, where $A_1 \times A_2$ is the set of all points $\omega$ with $\omega_i \in A_i$;

$\mu = \mu_1 \times \mu_2$ is the "product-measure" on $\mathcal{A}$, provided it exists, that is, is a measure on $\mathcal{A}$ uniquely determined by the relations $\mu(A_1 \times A_2) = \mu_1 A_1 \times \mu_2 A_2$ for all measurable rectangles $A_1 \times A_2$.

We intend to find conditions under which the product-measure exists and conditions under which integrals with respect to this measure can be expressed in terms of integrals with respect to the factor measures $\mu_i$. In what follows the subscripts 1 and 2 can be interchanged. We shall also frequently proceed to the usual abuse of notation which consists in the use of the same symbol for a function and for its values.

For every set $A \subset \Omega$, the section $A_{\omega_1}$ of $A$ at $\omega_1$ is the set of all points $\omega_2$ such that $(\omega_1, \omega_2) \in A$. For every function $X$ on $\Omega$, the section $X_{\omega_1}$ of $X$ at $\omega_1$ is the function defined on $\Omega_2$ by $X_{\omega_1}(\omega_2) = X(\omega_1, \omega_2)$.

a. Every section of a measurable set or function is measurable.

If $\mathcal{C}$ is the class of all the sets in $\Omega$ whose every section is measurable, then it is readily seen that $\mathcal{C}$ is a $\sigma$-field. But every section of a measurable rectangle $A_1 \times A_2$ is measurable, since it is either empty or is one of the sides. Therefore, $\mathcal{C} \supset \mathcal{A}$ and the first assertion is proved. If $X$ on $\Omega$ is measurable and $S \subset R$ is an arbitrary Borel set, the second assertion follows by

$$X_{\omega_1}^{-1}(S) = [\omega_2;\ X_{\omega_1}(\omega_2) \in S] = [\omega_2;\ X(\omega_1, \omega_2) \in S] = [\omega_2;\ (\omega_1, \omega_2) \in X^{-1}(S)] = (X^{-1}(S))_{\omega_1}.$$
A. Product-measure theorem. If $\mu_1$ on $\mathcal{A}_1$ and $\mu_2$ on $\mathcal{A}_2$ are $\sigma$-finite, then, for every $A \in \mathcal{A}_1 \times \mathcal{A}_2$, the functions with values $\mu_1 A_{\omega_2}$ and $\mu_2 A_{\omega_1}$ are measurable, and the set function $\mu$ with values

$$\mu A = \int (\mu_1 A_{\omega_2})\, d\mu_2 = \int (\mu_2 A_{\omega_1})\, d\mu_1$$

is a $\sigma$-finite measure $\mu$ on $\mathcal{A}_1 \times \mathcal{A}_2$ uniquely determined by the relation

$$\mu(A_1 \times A_2) = \mu_1 A_1 \times \mu_2 A_2, \qquad A_i \in \mathcal{A}_i.$$

In other words, $\mu$ is the product-measure $\mu_1 \times \mu_2$.

Proof. The proof is based upon the fact that, by the monotone convergence theorem, the class $\mathcal{M}$ of all those sets $A$ for which the integrals are equal is closed under formation of countable sums.

Since the measures $\mu_1$ and $\mu_2$ are $\sigma$-finite, the product space is decomposable into a countable sum of rectangles with sides of finite measure. It follows that, without restricting the generality, we can suppose that these measures are finite. If $A = A_1 \times A_2$ is a measurable rectangle, then $\mu_1 A_{\omega_2} = \mu_1 A_1 \times I_{A_2}(\omega_2)$ and similarly by interchanging the subscripts 1 and 2. Thus, the functions with these values are measurable and both integrals reduce to $\mu_1 A_1 \times \mu_2 A_2$. The last asserted equality is proved and $\mathcal{M}$ contains all measurable rectangles. It follows that $\mathcal{M}$ contains the field of finite sums of these rectangles. But $\mathcal{M}$ is closed under nondecreasing passages to the limit, on account of the monotone convergence theorem, and under nonincreasing ones, on account of the dominated convergence theorem and the finiteness of measures. Therefore, by 1.6, it contains the product $\sigma$-field $\mathcal{A}_1 \times \mathcal{A}_2$, and the equality of the integrals is proved. The finite set function $\mu$ on $\mathcal{A}$ so defined is a measure, on account of the monotone convergence theorem, and it is uniquely determined by the stated relation, on account of the extension theorem. This terminates the proof.
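For finite factor spaces the two section integrals in the product-measure theorem are finite sums, so their equality can be verified directly. The following sketch assumes measures given as point masses; the helper names (`section_at_first`, `section_at_second`, `measure`, `product_measure`) and the sample data are hypothetical.

```python
from fractions import Fraction


def section_at_first(A, w1):
    """Section of A at w1: all w2 with (w1, w2) in A."""
    return {b for (a, b) in A if a == w1}


def section_at_second(A, w2):
    """Section of A at w2: all w1 with (w1, w2) in A."""
    return {a for (a, b) in A if b == w2}


def measure(m, S):
    """Measure of a finite set S under point masses m."""
    return sum((m[w] for w in S), Fraction(0))


def product_measure(A, mu1, mu2):
    """Return both iterated expressions for the product measure of A."""
    via_first = sum(
        (mu1[w1] * measure(mu2, section_at_first(A, w1)) for w1 in mu1),
        Fraction(0),
    )
    via_second = sum(
        (mu2[w2] * measure(mu1, section_at_second(A, w2)) for w2 in mu2),
        Fraction(0),
    )
    return via_first, via_second


mu1 = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
mu2 = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
A = {("x", 0), ("x", 2), ("y", 1)}   # not a rectangle
rectangle = {("x", 0), ("x", 1)}     # the rectangle {x} x {0, 1}
```

For the rectangle, both expressions reduce to the product of the measures of the sides, as in the defining relation.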

Corollary. $A \in \mathcal{A}_1 \times \mathcal{A}_2$ is a $(\mu_1 \times \mu_2)$-null set if, and only if, almost every section $A_{\omega_1}$ is a $\mu_2$-null set.

For the integral of a nonnegative function vanishes if, and only if, the integrand vanishes a.e.

We are now in a position to answer the second stated question. The result is due to Lebesgue and Fubini and is generally called the Fubini theorem.
B. Iterated integrals theorem. Let $(\Omega_1, \mathcal{A}_1, \mu_1)$ and $(\Omega_2, \mathcal{A}_2, \mu_2)$ be $\sigma$-finite measure spaces.

If the $\mathcal{A}_1 \times \mathcal{A}_2$-measurable function $X$ on $\Omega_1 \times \Omega_2$ is nonnegative or $\mu_1 \times \mu_2$-integrable, then

$$\int_{\Omega_1 \times \Omega_2} X\, d(\mu_1 \times \mu_2) = \int_{\Omega_1} d\mu_1 \int_{\Omega_2} X_{\omega_1}\, d\mu_2 = \int_{\Omega_2} d\mu_2 \int_{\Omega_1} X_{\omega_2}\, d\mu_1,$$

and, in the integrability case, almost every section of $X$ is integrable.

The iterated integrals are to be read from right to left.

Proof. For $X = I_A$, the asserted equality reduces to that of the product-measure theorem. It follows that it holds for simple functions and hence holds for nonnegative measurable functions because of the monotone convergence theorem, since, if $0 \le X_n \uparrow X$, then $0 \le (X_n)_{\omega_i} \uparrow (X)_{\omega_i}$. If $X \ge 0$ is integrable, then the function $\int X_{\omega_1}\, d\mu_2$ of $\omega_1$ is integrable and hence a.e. finite, so that the functions $X_{\omega_1}$ of $\omega_2$ are almost all integrable. Therefore, if $X = X^+ - X^-$ is integrable, that is, $X^+$ and $X^-$ are integrable, then $(X)_{\omega_i} = (X^+)_{\omega_i} - (X^-)_{\omega_i}$ are almost all integrable and a.e. finite. This terminates the proof.
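On finite spaces every function is integrable and the iterated integrals are nested finite sums. The sketch below, with hypothetical names and data, compares the two orders of integration with the integral against the product measure.

```python
from fractions import Fraction


def iterated_12(X, mu1, mu2):
    """Integrate the section X_{w1} against mu2 first, then against mu1."""
    return sum(mu1[w1] * sum(X(w1, w2) * mu2[w2] for w2 in mu2) for w1 in mu1)


def iterated_21(X, mu1, mu2):
    """Integrate the section X_{w2} against mu1 first, then against mu2."""
    return sum(mu2[w2] * sum(X(w1, w2) * mu1[w1] for w1 in mu1) for w2 in mu2)


def double_integral(X, mu1, mu2):
    """Integral of X against the product measure, point by point."""
    return sum(X(w1, w2) * mu1[w1] * mu2[w2] for w1 in mu1 for w2 in mu2)


mu1 = {1: Fraction(1, 2), 2: Fraction(1, 2)}
mu2 = {1: Fraction(1, 3), 2: Fraction(2, 3)}


def X(w1, w2):
    # An arbitrary sign-changing function chosen for illustration.
    return w1 * w1 - 3 * w2
```

All three quantities equal $-5/2$ for this data. The finite setting hides the role of the hypotheses: without nonnegativity or integrability, the two iterated integrals need not agree on infinite spaces.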

Finite-dimensional case. What precedes extends in an obvious manner to the product of an arbitrary but finite number of measure spaces. The interesting case is the infinitely dimensional one, and we shall now investigate it from a somewhat more general point of view.

*8.3 Iterated integrals and infinite product spaces. In what follows we push the abuse of notation to its extreme.

We consider a sequence of measurable spaces $(\Omega_n, \mathcal{A}_n)$ and denote by $\omega_n$ points of $\Omega_n$ and by $A_n$ measurable sets in $\Omega_n$ (sets of $\mathcal{A}_n$). The product measurable space $(\Omega_1 \times \cdots \times \Omega_n, \mathcal{A}_1 \times \cdots \times \mathcal{A}_n)$ is the space of points $(\omega_1, \ldots, \omega_n)$ together with the minimal $\sigma$-field over the intervals $A_1 \times \cdots \times A_n$. The product measurable space $(\prod \Omega_n, \prod \mathcal{A}_n)$ is the space of points $(\omega_1, \omega_2, \ldots)$ and the minimal $\sigma$-field over all cylinders of the form $A_1 \times \cdots \times A_n \times \prod_{k=n+1}^{\infty} \Omega_k$ or, equivalently, over all cylinders of the form $C(B_n) = B_n \times \prod_{k=n+1}^{\infty} \Omega_k$ where the base $B_n$ is a measurable set in $\Omega_1 \times \cdots \times \Omega_n$.

In the infinitely dimensional case, we must, for reasons of "consistency" (to be made clear later), limit ourselves to probabilities, that is, to measures which assign value one to the space, to be denoted by $P, Q, \ldots$, with or without affixes. Furthermore, in probability theory, the







following more general concept plays a basic role (at least when "independence" (see Part III) is not assumed). Every function, to be denoted by $P_n(\omega_1, \ldots, \omega_{n-1}; A_n)$, which is a probability in $A_n$ for every fixed point $(\omega_1, \ldots, \omega_{n-1})$ and a measurable function in this point for every fixed $A_n$ will be called a regular conditional probability. For $n = 1$ it reduces to a probability $P_1$ on $\mathcal{A}_1$ but for $n > 1$ it reduces to a probability on $\mathcal{A}_n$ only when it is constant in $(\omega_1, \ldots, \omega_{n-1})$ for every fixed $A_n$, provided the ordered $T$ has a first element. We observe that the functions $\mu_2 A_{\omega_1} = P_2(\omega_1; A_{\omega_1})$ are regular conditional probabilities when $\mu_2 \Omega_2 = 1$. On account of the monotone convergence theorem, iterated integrals of the form

$$Q_n B_n = \int P_1(d\omega_1) \int P_2(\omega_1; d\omega_2) \cdots \int P_n(\omega_1, \ldots, \omega_{n-1}; d\omega_n)\, I_{B_n}(\omega_1, \ldots, \omega_n)$$

define probabilities $Q_n$ on $\mathcal{A}_1 \times \cdots \times \mathcal{A}_n$. It follows by the same theorem that if a measurable function $X$ on $\Omega_1 \times \cdots \times \Omega_n$ is nonnegative or $Q_n$-integrable, then

$$\int_{\Omega_1 \times \cdots \times \Omega_n} X\, dQ_n = \int P_1(d\omega_1) \int P_2(\omega_1; d\omega_2) \cdots \int P_n(\omega_1, \ldots, \omega_{n-1}; d\omega_n)\, X(\omega_1, \ldots, \omega_n).$$

A. Iterated regular conditional probabilities theorem. The iterated integrals

$$QC(B_n) = \int P_1(d\omega_1) \cdots \int P_n(\omega_1, \ldots, \omega_{n-1}; d\omega_n)\, I_{B_n}(\omega_1, \ldots, \omega_n)$$

determine a probability $Q$ on $\prod \mathcal{A}_n$.

This extension of the product-probability theorem is due to Tulcea and, proceeding as therein (in 1°), permits one to determine $Q$ on an arbitrary $\prod_{t \in T} \mathcal{A}_t$ under obvious consistency conditions on the regular conditional probabilities $P_{t_{n+1}}(\omega_{t_1}, \ldots, \omega_{t_n}; A_{t_{n+1}})$.

Proof. To begin with, the definition of $Q$ on the class $\mathcal{C}$ of all cylinders of the form $C(B_n)$ is consistent. For, if $C(B_n) = C(B_m)$, $m < n$, then integrations with respect to the $\omega_k$ which do not belong to the product subspace where $B_m$ lies yield factors one.

Since $Q$ on $\mathcal{C}$ is finitely additive, the assertion will follow by the extension theorem if we prove that $Q$ on $\mathcal{C}$ is continuous at $\emptyset$. We have to consider nonincreasing sequences of cylinders which converge to $\emptyset$. Upon renumbering the indices, we can suppose that the sequences are of the form $C(B_n) \downarrow \emptyset$ with nonempty bases $B_n \in \mathcal{A}_1 \times \cdots \times \mathcal{A}_n$. We can write

$$(1) \qquad QC(B_n) = \int P_1(d\omega_1)\, Q^{(1)}C(B_n)_{\omega_1}$$

where $(B_n)_{\omega_1}$ is the section of $B_n$ at $\omega_1$ and

$$Q^{(1)}C(B_n)_{\omega_1} = \int P_2(\omega_1; d\omega_2) \cdots \int P_n(\omega_1, \ldots, \omega_{n-1}; d\omega_n)\, I_{B_n}(\omega_1, \ldots, \omega_n).$$

In (1) the left-hand side is nonincreasing in $n$, and the integrand converges nonincreasingly to a certain limit $X_1(\omega_1) \ge 0$. By the dominated convergence theorem, the limit of the left-hand side is $\int P_1(d\omega_1)\, X_1(\omega_1)$.

Assume that this integral is positive. Then there exists a point $\bar\omega_1$ such that $X_1(\bar\omega_1) > 0$. It follows that we find ourselves in the same situation but with the sequence $Q^{(1)}C(B_n)_{\bar\omega_1}$ instead of $QC(B_n)$. Repeating the argument over and over again, we obtain a sequence $\bar\omega = (\bar\omega_1, \bar\omega_2, \ldots)$ such that $\bar\omega_n \in \Omega_n$ and $Q^{(n)}C(B_m)_{\bar\omega_1, \ldots, \bar\omega_n} \downarrow X_n(\bar\omega_n) > 0$ as $m \to \infty$. Therefore, every $C(B_n)$ contains at least one point of the form $(\bar\omega_1, \ldots, \bar\omega_n, \omega_{n+1}, \ldots)$. Since $C(B_n) = B_n \times \prod_{k=n+1}^{\infty} \Omega_k$, it contains the point $\bar\omega$ and, hence, $\bar\omega \in \bigcap C(B_n)$. Thus, when $QC(B_n) \not\to 0$ the intersection is not empty, and the theorem follows ab contrario.
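For finite state spaces the iterated integrals defining $QC(B_n)$ are finite sums over paths, and the consistency used at the start of the proof can be checked directly. The sketch below is an illustration under assumptions: two states 0 and 1, a uniform initial law, and a Polya-urn style kernel that depends on the whole past; the names `initial_law`, `kernel`, `cylinder_probability`, and `extend` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def initial_law(state):
    """P_1: uniform probability on the two states 0 and 1."""
    return Fraction(1, 2)


def kernel(history, state):
    """Regular conditional probability of the next state given the past.

    Polya-urn style: P(next = 1 | history) = (1 + number of ones) / (length + 2).
    """
    p_one = Fraction(1 + sum(history), len(history) + 2)
    return p_one if state == 1 else 1 - p_one


def cylinder_probability(base):
    """Q C(B_n) for a base given as a set of 0/1 tuples of common length n."""
    total = Fraction(0)
    for path in base:
        prob = initial_law(path[0])
        for i in range(1, len(path)):
            prob *= kernel(path[:i], path[i])
        total += prob
    return total


def extend(base):
    """Base of the same cylinder described with one more coordinate."""
    return {path + (s,) for path in base for s in (0, 1)}


B2 = {(1, 1)}
whole_space = set(product((0, 1), repeat=4))
```

Describing the same cylinder with one more coordinate leaves its probability unchanged, since the extra integration yields a factor one; this is the consistency noted in the proof.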

Particular cases. 1° If $P_n(\omega_1, \ldots, \omega_{n-1}; A_n) = P_n A_n$ are constant for every fixed $A_n$, then we write $Q = \prod P_n$ and call it a product-probability. Then the theorem reduces to the product-probability theorem in the denumerable case (4.2A).

2° If the factor spaces are finite-dimensional Borel spaces, then it follows from 27.2, Application 1, that the theorem yields the consistency theorem.

COMPLEMENTS AND DETAILS 

Notation. Unless otherwise stated, the measure space $(\Omega, \mathcal{A}, \mu)$ is fixed, the (measurable) sets $A, B, \ldots$, with or without affixes, belong to $\mathcal{A}$, and the functions $X, Y, \ldots$, with or without affixes, are finite measurable functions.

1. The set $C$ of convergence of a sequence $X_n$ (to a finite or infinite limit function) is measurable.

($C = [\liminf X_n = \limsup X_n]$.)

2. If $\mu$ is finite, then given $X$, for every $\epsilon > 0$ there exists $A$ such that $\mu A < \epsilon$ and $X$ is bounded on $A^c$. If $X$ is bounded, then there exists a sequence of simple functions which converges uniformly to $X$. Combine both propositions.





We say that a sequence $X_n$ converges almost uniformly (a.u.) to $X$, and write $X_n \xrightarrow{\text{a.u.}} X$, if, for every $\epsilon > 0$, there exists a set $A$ with $\mu A < \epsilon$ such that $X_n \to X$ uniformly on $A^c$.

3. If $X_n \xrightarrow{\text{a.u.}} X$, then $X_n \xrightarrow{\text{a.e.}} X$ and $X_n \xrightarrow{\mu} X$. (For the first assertion, form $\bigcap A_n$ where $A_n$ is the $A$ of the foregoing definition with $\epsilon = 1/n$.)

4. If $X_n \xrightarrow{\mu} X$, then there exists a subsequence $X_{n'} \xrightarrow{\text{a.u.}} X$.

5. Egoroff's theorem. If $\mu$ is finite, then $X_n \xrightarrow{\text{a.e.}} X$ implies that $X_n \xrightarrow{\text{a.u.}} X$. Compare with 3. (Neglect the null set of divergence, and form $A = \bigcup_{m=1}^{\infty} A_m$ with $A_m = \bigcup_{k \ge n(m)} \bigl[\,|X_k - X| \ge \tfrac{1}{m}\bigr]$ and $n(m)$ such that $\mu A_m < \epsilon / 2^m$.)

6. Lusin's theorem. If $\mu$ is $\sigma$-finite, then $X_n \xrightarrow{\text{a.e.}} X$ implies that $X_n \to X$ uniformly on every element $A_j$ of some countable partition of $\Omega - N$ where $N$ is some null set. (Neglect the null set of divergence, and start with $\mu$ finite. Use Egoroff's theorem to select inductively sets $A_k$ such that $\mu A_k < 1/k$ and $X_n \to X$ uniformly on $A_k^c$ for every $k$.)

7. If $\mu$ is finite, then $X_n \xrightarrow{\text{a.e.}} X$ implies existence of a set of positive measure on which the $X_n$ are uniformly bounded. What if $\mu$ is $\sigma$-finite?

8. If $\mu$ is finite, then $X_{mn} \xrightarrow{\text{a.e.}} X_m$ as $n \to \infty$ and $X_m \xrightarrow{\text{a.e.}} X$ as $m \to \infty$ imply that there exist subsequences $m_k$, $n_k$ such that $X_{m_k n_k} \xrightarrow{\text{a.e.}} X$ as $k \to \infty$. What if $\mu$ is $\sigma$-finite?

(Neglect the null sets of divergence. Select $A_k$ and $m_k$ such that $\mu A_k < 1/2^k$ and $|X_{m_k} - X| < 1/2^k$ on $A_k^c$. Select $B_k \subset A_k^c$ and $n_k$ such that $\mu B_k < 1/2^k$ and $|X_{m_k n_k} - X_{m_k}| < 1/2^k$ on $A_k^c - B_k$.)
9. Let $X_n \xrightarrow{\mu} X$, $Y_n \xrightarrow{\mu} Y$. Do $aX_n + bY_n \xrightarrow{\mu} aX + bY$, $|X_n| \xrightarrow{\mu} |X|$, $X_n^2 \xrightarrow{\mu} X^2$, $X_n Y_n \xrightarrow{\mu} XY$? What about $1/X_n$? Let $\mu$ be finite and let $g$ on $R$ or on $R \times R$ be continuous. What about the sequences $g(X_n)$ and $g(X_n, Y_n)$?

10. Let the functions $X_n$, $X$ on the measure space be complex-valued or vector-valued or, more generally, let them take their values in some fixed Banach space. Denote the norm of $X$ by $|X|$, and denote $|X_n - X| \to 0$ by $X_n \to X$.

Transpose the constructive definitions of measurability and the definitions of various types of convergence. Investigate the validity of the transposed versions of the corresponding properties established in the text, as well as of those stated above.

11. Examples and counterexamples of mutual implications of types of convergence. Investigate convergences of the sequences defined below:

(i) The measure space is the Borel line with Lebesgue measure, $X_n = 1$ on $[n, n + 1]$ and $X_n = 0$ elsewhere.

(ii) The measure space is the Borel interval $(0, 1)$ with Lebesgue measure, $X_n = 1$ on $(0, 1/n)$ and $X_n = 0$ elsewhere.

(iii) The measure space is the Borel interval $[0, 1]$ with Lebesgue measure, the sequence is $X_{11}, X_{21}, X_{22}, X_{31}, X_{32}, X_{33}, \ldots$ with $X_{nk} = 1$ on $\bigl[\tfrac{k-1}{n}, \tfrac{k}{n}\bigr]$ and $X_{nk} = 0$ elsewhere.

(iv) $\mathcal{A}$ consists of all subsets of the set of positive integers, $\mu A$ is the number of points of $A$, $X_n$ is the indicator of the set of the $n$ first integers.
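As a starting point for example (iii), the values of the row $X_{n1}, \ldots, X_{nn}$ at a fixed point can be tabulated. This is only an exploratory sketch: the helper names `x_nk` and `row_values` and the sample point $1/3$ are hypothetical choices, and exact rational arithmetic is used to avoid rounding at interval endpoints.

```python
from fractions import Fraction


def x_nk(n, k, w):
    """Indicator of the interval [(k-1)/n, k/n] evaluated at the point w."""
    return 1 if Fraction(k - 1, n) <= w <= Fraction(k, n) else 0


def row_values(n, w):
    """Values X_{n1}(w), ..., X_{nn}(w) of the n-th row at the point w."""
    return [x_nk(n, k, w) for k in range(1, n + 1)]


w = Fraction(1, 3)
rows = {n: row_values(n, w) for n in range(3, 30)}
```

In each tabulated row both values 0 and 1 occur at this point, while the set where $X_{nk} = 1$ has Lebesgue measure $1/n$; comparing these two observations is a reasonable first step in the investigation.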

12. If $X$ is integrable, then the set $[X \ne 0]$ is of $\sigma$-finite measure. What if $\int X$ exists?

13. Let $(T, \mathcal{T}, \tau)$ be a measure space, to every point $t$ of which is assigned a measure $\mu_t$ on $\mathcal{A}$. Let the function on $T$ defined by $\mu_t A$ for any fixed $A$ be $\mathcal{T}$-measurable.

The relation $\mu A = \int \mu_t A\, d\tau(t)$ defines a measure $\mu$ on $\mathcal{A}$. If $\int X(\omega)\, d\mu(\omega)$ exists, then the function defined on $T$ by $U(t) = \int X(\omega)\, d\mu_t(\omega)$ exists and is $\mathcal{T}$-measurable, and $\int X(\omega)\, d\mu(\omega) = \int U(t)\, d\tau(t)$.

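14. Let $\varphi$ be the indefinite integral of $X$. Express $\varphi^+$, $\varphi^-$, $\bar\varphi$ in terms of $X$.

15. If $\int_A X_n \to 0$ uniformly in $n$ as $\mu A \to 0$ or as $A \downarrow \emptyset$, then the same is true of $\int_A |X_n|$; and conversely. Interpret in terms of signed measures.

$$\Bigl(\int_A |X_n| = \int_{A[X_n > 0]} X_n - \int_{A[X_n < 0]} X_n.\Bigr)$$

16. If finite $\int_A X_n \to \int_A X$ finite, uniformly in $A$ ($\in \mathcal{A}$), then $\int |X_n - X| \to 0$; and conversely.

17. If $0 \le X_n \xrightarrow{\text{a.e.}} X$, then finite $\int X_n \to \int X$ finite implies that $\int_A X_n \to \int_A X$ uniformly in $A$ (also if $\xrightarrow{\text{a.e.}}$ is replaced by $\xrightarrow{\mu}$).

($0 \le (X - X_n)^+ \le X$ integrable, and $\int (X - X_n)^+ - \int (X - X_n)^- \to 0$.)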

18. Rewrite in terms of integrals as many as possible of the complements and details of Chapter I.

19. If the $X_n$ are integrable and $\lim \int_A X_n$ exists and is finite for every $A$, then the $\int |X_n|$ are uniformly bounded, $\int_A |X_n| \to 0$ uniformly in $n$ as $\mu A \to 0$ and as $A \downarrow \emptyset$, and there exists an integrable $X$, determined up to an equivalence, such that $\int_A X_n \to \int_A X$ for every $A$. (Use 18.)

20. If integrable $X_n \xrightarrow{\mu} X$ integrable, then existence and finiteness of $\lim \int_A X_n$ for every $A$ are equivalent to the following properties:

(i) $\int_A X_n \to \int_A X$ uniformly in $A$;

(ii) $\int_A |X_n| \to 0$ uniformly in $n$ as $\mu A \to 0$ and as $A \downarrow \emptyset$.

If $\mu$ is finite, then "as $A \downarrow \emptyset$" can be suppressed. (Use the preceding propositions and the relations

$$\int_A |X_n| \le \int_A |X_n - X| + \int_A |X|, \ \ldots)$$
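21. The differential formalism applies to Radon-Nikodym derivatives:

Let $\mu$, $\nu$ be finite measures on $\mathcal{A}$ and $\varphi$, $\varphi'$ be $\sigma$-finite signed measures on $\mathcal{A}$. Let $\varphi$ be $\nu$-continuous and $\nu$, $\varphi$, $\varphi'$ be $\mu$-continuous. Then

$$\frac{d(\varphi + \varphi')}{d\mu} = \frac{d\varphi}{d\mu} + \frac{d\varphi'}{d\mu} \quad \mu\text{-a.e.},$$

$$\frac{d\varphi}{d\mu} = \frac{d\varphi}{d\nu}\,\frac{d\nu}{d\mu} \quad \mu\text{-a.e.}$$

(For the second assertion, it suffices to consider $\varphi \ge 0$, $X = \dfrac{d\varphi}{d\nu} \ge 0$, $Y = \dfrac{d\nu}{d\mu} \ge 0$. Take simple $X_n$ with $0 \le X_n \uparrow X$ so that

$$\int_A X\, d\nu \leftarrow \int_A X_n\, d\nu = \int_A X_n Y\, d\mu \to \int_A XY\, d\mu.)$$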
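The chain rule of proposition 21 can be checked on a finite space, where derivatives are ratios of point masses. The following sketch uses hypothetical data and a hypothetical helper `derivative`; the value 0 assigned where the denominator measure vanishes is an arbitrary choice, permitted because derivatives are only determined up to equivalence.

```python
from fractions import Fraction


def derivative(num, den):
    """Density of num with respect to den on a finite set.

    Where den vanishes the value is arbitrary; 0 is chosen here.
    """
    return {
        w: (Fraction(num[w]) / den[w] if den[w] != 0 else Fraction(0))
        for w in den
    }


mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
nu = {1: Fraction(1, 4), 2: Fraction(3, 4), 3: Fraction(0)}   # mu-continuous
phi = {1: Fraction(1), 2: Fraction(-3), 3: Fraction(0)}       # signed, nu-continuous

dphi_dnu = derivative(phi, nu)
dnu_dmu = derivative(nu, mu)
dphi_dmu = derivative(phi, mu)
chain = {w: dphi_dnu[w] * dnu_dmu[w] for w in mu}
```

Since `mu` charges every point in this example, the equality holds at every point and not merely $\mu$-a.e.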

Let $\{\mu_t, t \in T\}$ and $\{\mu'_{t'}, t' \in T'\}$ be two families of measures on $\mathcal{A}$; we drop $t \in T$ and $t' \in T'$ unless confusion is possible. We say that $\{\mu_t\}$ is $\{\mu'_{t'}\}$-continuous if every set null for all $\mu'_{t'}$ is null for all $\mu_t$. If the converse is also true, we say that the two families are mutually continuous.

22. If $\{\mu_j\}$ is a countable family of finite measures, then there exists a finite measure $\mu$ such that $\{\mu_j\}$ and $\mu$ are mutually continuous. (Take $\mu = \sum_j \mu_j / 2^j \mu_j\Omega$.)

23. Let the $\mu_t$ and $\mu$ be finite measures. If $\{\mu_t\}$ is $\mu$-continuous, then there exists a finite measure $\mu'$ such that $\{\mu_t\}$ and $\mu'$ are mutually continuous. (Select sets $A_t = \bigl[\tfrac{d\mu_t}{d\mu} > 0\bigr]$. Denote by $B$, with or without affixes, sets such that, for some $t$, $B \subset A_t$ and $\mu_t B > 0$. Denote countable sums of sets $B$ up to $\mu$-null sets by $C$, with or without affixes. Every subset $C' \subset C$ with $\mu C' > 0$ is a set $C$; every countable union of sets $C$ is a set $C$. Let $\mu C_n \to s$ where $s$ is the supremum of values of $\mu$ over all the sets $C$. Then $s = \mu \bigcup C_n = \mu \sum B_m$ and to every $m$ there corresponds a $\mu_t$, say $\mu_m$, such that $B_m \subset A_m$ and $\mu_m B_m > 0$. The families $\{\mu_t\}$ and $\{\mu_m\}$ are mutually continuous.)
n n 
24. Let p n = 2Z 抑 —A and v n ― Vk 

• " k. 


v y all the y. and v with various 


affixes being finite measures on ft and every V n being /Z n -contmuous. 


(0 


d\i\ djii 


M-a.e. 


djin dp. 

(il) if {Mn} is ^-continuous, then 學 5 一•辛 

dv dv 


^-a.e. 


(iii) v Is /Z-continuous and 




dv 


dfln dp. 


ju-a.e. 


(For the last assertion, If \x n A n = 0 for all n y then/Z (lim sup = 0. It fol- 

• dv n n 

lows that It suffices to consider a particular choice of the — = ^ ^ 

dv k dtlk _ „ v 办 w', ' 

where 石 =— ， Yi ： = — • But zl = -jz a nd ^Y n = lju-a.e.) 

The propositions which follow correspond to various definitions of the concept of integration. We shall assume that the measures and the functions are finite. Besides proving the statements, the reader should also examine removal of the restriction of finiteness as well as of other restrictions which may be introduced.

25. Set

$$\int X\, d\varphi = \int X\, d\varphi^+ - \int X\, d\varphi^-, \qquad \int (X + iY)\, d\mu = \int X\, d\mu + i \int Y\, d\mu,$$

$$\int X\, d(\mu + i\nu) = \int X\, d\mu + i \int X\, d\nu$$

and investigate existence and properties of integrals so defined.

26. Descriptive approach. The Radon-Nikodym theorem characterizes an indefinite integral but not that of a given function. The following proposition answers this requirement.

$\varphi$ on $\mathcal{A}$ is the indefinite integral of $X$ on $\Omega$ if, and only if, $\varphi$ is $\sigma$-additive and, for every set $A = [a \le X \le b]B$, $B \in \mathcal{A}$,

$$a\mu A \le \varphi(A) \le b\mu A.$$

27. In the definition of the integral given in the text, start with (nonnegative) elementary functions instead of simple ones. The integral so defined coincides with the initial one.

28. Lebesgue's approach. The Cauchy-Riemann approach starts with arbitrary finite partitions of the interval of integration into intervals. The Lebesgue approach consists in partitioning the set of integration according to the function to be integrated so that the integral is tailored to order as opposed to the ready-to-wear Cauchy-Riemann one. Let $\mu < \infty$.

Set 

If $X$ is bounded, these sums correspond to finite partitions and $\int X = \lim \Sigma_n(X)$. If $X$ is not bounded, set $X_{mn} = X$ if $-m \le X \le n$ and $X_{mn} = 0$ otherwise. If $X$ is integrable, then $\int X_{mn} \to \int X$ as $m, n \to \infty$.

If $X$ is not bounded, the series $\Sigma_n(X)$ correspond to countable partitions and $\int X = \lim \Sigma_n(X)$ in the sense that if $X$ is integrable, then these series are absolutely convergent and the equality holds and, conversely, if one of these series is absolutely convergent, so are all of them and the equality holds.

(For the last assertion, it suffices to consider nonnegative elementary func- 
. 00 走一 1 w 

tions X n = ^2 —c /「hi * 下 For the converse, use the relation 

XS2X n + ^Q.) L 」 

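29. Darboux-Young approach. Let $X$ be measurable or not and set

$$\underline{\int} X = \sup \sum_k \inf_{\omega \in A_k} X(\omega)\, \mu A_k, \qquad \overline{\int} X = \inf \sum_k \sup_{\omega \in A_k} X(\omega)\, \mu A_k$$

where the extrema of sums are taken over all finite measurable partitions $\sum_{k=1}^{n} A_k = \Omega$. If $X$ is measurable and bounded, then

$$\underline{\int} X = \int X = \overline{\int} X.$$

If $\underline{\int} X$ and $\overline{\int} X$ exist and are equal, we say that $\int X$ exists and equals their common value.

We can also set

$$\underline{\int} X = \sup \int Y, \qquad \overline{\int} X = \inf \int Z$$

where the extrema are taken over all integrable (and measurable) $Y$ and $Z$ such that $Y \le X \le Z$, and define $\int X$ as above. Compare the two definitions.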

30. Completion approach. The Méray-Cantor method for completion of metric spaces adjoins to the given metric space elements which represent mutually convergent (in distance) sequences of its points. This method permits (Dunford) to define and study the integral of functions with values in an arbitrary Banach space (Bochner), as follows:

(i) Define the indefinite integral of a simple function as in the text. Since nonnegativity and infinite values may be meaningless, all simple functions under consideration are integrable.

(ii) Adjoin to the space of these integrable functions $X_m, X_n, \ldots$ all functions $X$ such that $\int |X_m - X_n| \to 0$ and $X_n \xrightarrow{\mu} X$, by defining the indefinite integral of $X$ as the limit of the indefinite integrals of the $X_n$. To justify this definition, prove for simple functions those elementary properties of integrals which continue to have content for an arbitrary Banach space: $\int |X_m - X_n| \to 0$ if, and only if, $X_n \xrightarrow{\mu} X$ where $X$ is some measurable function, and $\int_A |X_n| \to 0$ uniformly in $n$ as $\mu A \to 0$; $\int |X_m - X_n| \to 0$ implies that $\varphi_n \to \varphi$ where $\varphi$ is $\sigma$-additive.

(iii) Extend the foregoing properties to all integrable functions and obtain the dominated convergence theorem.

31. Kolmogorov's approach. Let $\mathcal{C}$ be a class closed under intersections. Let $\mathcal{D}$, with or without affixes, be finite disjoint subclasses of $\mathcal{C}$. Order them by the relation $\mathcal{D}_1 \prec \mathcal{D}_2$ if every set of $\mathcal{D}_2$ is contained in some set of $\mathcal{D}_1$. Fix $A \in \mathcal{C}$ and consider all the $\mathcal{D}$ which are partitions of $A$. They form a "direction" $\Delta$ in the sense that, if $\mathcal{D}_1$ and $\mathcal{D}_2$ are such partitions, then there exists such a partition which "follows" both, namely, $\mathcal{D}_1 \cap \mathcal{D}_2$.

Let $\varphi$ on $\mathcal{C}$ be a function, additive or not, single-valued or not. By definition,

$$\Phi(A) = \int_A d\varphi = \lim \sum_j \varphi(A_j)$$

where the $A_j$ are elements of partitions $\mathcal{D}$ of $A$ and the limit $\Phi(A)$, if it exists, is "along the direction $\Delta$," that is, to every $\epsilon > 0$ there corresponds a $\mathcal{D}_\epsilon$ such that $|\Phi(A) - \sum_j \varphi(A_j)| < \epsilon$ for all $\mathcal{D} \succ \mathcal{D}_\epsilon$ and all values of the $\varphi(A_j)$, if $\varphi$ is multivalued. If $\Phi(A)$ exists, it is unique. If $\Phi$ on $\mathcal{C}$ exists, then it is finitely additive.

Compare this integral to the Riemann-Stieltjes integral by selecting conveniently $\varphi$.

Compare $\int d\varphi$ with the length (if it exists) of the arc $\alpha\beta$ of a plane curve, by taking $\varphi(\alpha_{k-1}, \alpha_k) = \overline{\alpha_{k-1}\alpha_k}$, the length of the chord $\alpha_{k-1}$ to $\alpha_k$, the $\alpha = \alpha_1, \ldots, \alpha_{k-1}, \alpha_k, \ldots, \alpha_n = \beta$ being consecutive points on the arc $\alpha\beta$.

We say that $\varphi$ and $\varphi'$ on $\mathcal{C}$ are "differentially equivalent" on $A$ if, for every $\epsilon > 0$, there exists a partition $\mathcal{D}_\epsilon$ of $A$ such that $\sum_j |\varphi(A_j) - \varphi'(A_j)| < \epsilon$ for all $\mathcal{D} \succ \mathcal{D}_\epsilon$. If $\varphi$ is finitely additive, then $\Phi = \varphi$. If not, then $\Phi$ on $A \cap \mathcal{C}$ (if it exists) is the unique additive function differentially equivalent on $A$ to $\varphi$. Proceed as follows:

(i) $\varphi$ and $\varphi'$ are differentially equivalent on $A$ if, and only if, $\int d\varphi = \int d\varphi'$.

(ii) $\varphi$ and $\Phi$ are differentially equivalent on $A$.

(iii) If finitely additive functions $\varphi$ and $\varphi'$ are differentially equivalent on $A$, then they coincide on $A$.

In all which precedes replace "finite" by "countable" and investigate the validity of the propositions so obtained. Compare the various definitions of the integral, by selecting conveniently $\varphi$.

Finally, take $\varphi$ with values in a fixed but arbitrary Banach space, and go over what precedes.

32. A structure of the concept of integration. The concept of integration is constructed by means of the concepts of summations and of passage to the limit along a direction or, more generally, a cut-direction. A bipartition $\Delta = \underline{\Delta} + \overline{\Delta}$ of a set $\Delta$ with an order relation $\prec$ is a "cut-direction" if $\underline{\Delta}$ and $\overline{\Delta}$ are directions and every element of $\overline{\Delta}$ follows every element of $\underline{\Delta}$.

Let $\varphi$ be a function, single-valued or not, on a direction $\Delta$ to a real line or a plane or, more generally, a Banach space. The element $\varphi_\Delta$ of the range space is "limit of $\varphi$ along $\Delta$" if, for every $\epsilon > 0$, there exists an $\alpha_\epsilon \in \Delta$ such that $|\varphi_\Delta - \varphi(\alpha)| < \epsilon$ for all $\alpha \succ \alpha_\epsilon$ and for all values of $\varphi(\alpha)$. If the direction $\Delta$ is replaced by a cut-direction, then $\varphi_\Delta$ is "limit of $\varphi$ along the cut-direction" if, for every $\epsilon > 0$ there exist $\underline{\alpha}_\epsilon \in \underline{\Delta}$ and $\overline{\alpha}_\epsilon \in \overline{\Delta}$ such that $|\varphi_\Delta - \varphi(\alpha)| < \epsilon$ for all $\alpha$ such that $\underline{\alpha}_\epsilon \prec \alpha \prec \overline{\alpha}_\epsilon$ and for all values of $\varphi(\alpha)$. If these limits exist, they are unique.

To every $\alpha \in \Delta$ assign some finite collection of points $\alpha_j$ of a Banach space, not necessarily distinct and not necessarily uniquely determined. Form $\varphi(\alpha) = \sum_j \alpha_j$. By definition, $\int d\varphi$ is the limit, if it exists, of $\varphi$ along $\Delta$. If $\Delta$ is replaced by a cut-direction, the definition continues to apply.

Investigate all definitions of the integral you know of from this structural point of view, that is, the selections of the direction or cut-direction, and of the functions $\varphi$.

33. Daniell approach. Let $S$ be a family of bounded real-valued functions on
$\Omega$, closed under finite linear combinations and lattice operations
$f \cup g = \max(f, g)$, $f \cap g = \min(f, g)$. Then $f \in S \Rightarrow
|f| = f \cup 0 - f \cap 0 \in S$. Suppose that on $S$ is defined an integral
$\int$, that is, a nonnegative linear functional continuous under monotone limits:
$$f \ge 0 \Rightarrow \int f \ge 0, \quad \int (af + bg) = a\int f + b\int g, \quad f_n \downarrow 0 \Rightarrow \int f_n \downarrow 0.$$

a) Let $U$ be the family of limits (not necessarily finite) of nondecreasing
sequences in $S$. $U$ contains $S$ and is closed under addition, multiplication
by nonnegative constants, and lattice operations. Extend the integral on $U$,
setting $\int f = \lim \int f_n$ when $S \ni f_n \uparrow f$ (infinite values
being permitted).

The definition is justified, for if the nondecreasing sequences $f_n$ and $g_n$
in $S$ are such that $\lim f_n \le \lim g_n$, then $\lim \int f_n \le \lim \int g_n$.

If $U \ni f_n \uparrow f$, then $f \in U$ and $\int f_n \uparrow \int f$.

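b) Let $-U$ be the family of functions $f$ such that $-f \in U$, and set
$$\int f = -\int (-f).$$
If $g \in -U$, $h \in U$ and $g \le h$, then $h - g \in U$ and
$\int h - \int g = \int (h - g) \ge 0$.

By definition, $f$ is integrable if, for every $\epsilon > 0$, there exist
$g_\epsilon \in -U$ and $h_\epsilon \in U$ such that $g_\epsilon \le f \le h_\epsilon$,
$\int g_\epsilon$ and $\int h_\epsilon$ are finite, and
$\int h_\epsilon - \int g_\epsilon < \epsilon$. Then
$\sup \int g_\epsilon = \inf \int h_\epsilon$, and $\int f$ is defined to be
this common value.

Let $L$ be the family of integrable functions. $L$ and the integral on $L$ have
all the properties of $S$ and of the integral on $S$.

If $L \ni f_n \uparrow f$ and $\lim \int f_n < \infty$, then $f \in L$ and
$\int f_n \uparrow \int f$.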

Let be the smallest monotone family over S (closed under monotone passages 
to the limit by sequences), is closed under algebraic and lattice operations.
Let Li = L JL\ if and only if/C 5 and there existssuch that

l/l … _ 

e) Let 5+ be the smallest monotone family over (consisting of all nonnegative 
functions of S). SetJ/ = °o if / C 5+ is not integrable. By definition, for 

/dj/=]>-//- exists if/+ or / 一 is integrable. 

IfJ/ andj^ exist and they are not infinite with opposite sign, thenJ* (/ + g) 
exists and equalsJ/ + JV 

ifj>„ exist, f/i > - 00 , and/„T/, then P exists and JmJ/. 

f) If I a G then, by definition, the measure of A is \iA = J*/a. 

If I Ay Ib G JF, then I A \jn y Iao B y C 5 and If the /a w G then Izi An 
C ^ and 

g) Suppose that Then and \f a > 0, 

then I[/>a) G 

If/ ^ 0, 1[f>a) G ^ for every a > 0 y then/ C 

h) Suppose that 1 C Then / C J/ — ^Jd\x where the right side Is 

taken in the customary sense. What if/C 

i) The family $S$ is a real linear normed space with the uniform norm
$\|f\| = \sup |f|$. Every bounded linear functional $\varphi(f)$ on this space
is the difference of two bounded nonnegative linear functionals,
$\varphi(f) = \varphi^+(f) - \varphi^-(f)$: Take
$\varphi^+(f) = \sup\{\varphi(f') : 0 \le f' \le f\}$ on $S^+$, then extend to
$S$ by linearity.

34. Riesz representation. Let $\mathcal{X}$ be a locally compact space with
points $x$, compacts $K$, and the $\sigma$-field $\mathcal{S}$ of topological
Borel sets $S$, with or without subscripts. Let $C$ be the space of bounded
continuous functions $g$, with or without affixes, with the uniform norm
$\|g\| = \sup |g|$. $C_0 \subset C$ consists of those $g$ which vanish at
infinity: given $\epsilon > 0$ there exists a $K_\epsilon$ such that
$|g| < \epsilon$ on $K_\epsilon^c$. $C_{00} \subset C$ consists of those $g$
which vanish off compacts and $C_K \subset C_{00}$ of those $g$ which vanish
off $K$. If $\mathcal{X}$ is compact, then $C_{\mathcal{X}} = C_{00} = C_0 = C$.

a) Dini. If $g_n \in C_{00}$ and $g_n \downarrow 0$, then $g_n \downarrow 0$
uniformly, that is, $\|g_n\| \downarrow 0$.

b) Nonnegative linear functionals $\mu(g)$ on $C_{00}$ are bounded on every
$C_K$ and are integrals on $C_{00}$: Bounded, since there exists
$g_0 \in C_{00}^+$ with $g_0 \ge 1$ on $K$, hence $g \in C_K$ implies
$|g| \le g_0 \|g\|$ and $|\mu(g)| \le \mu(g_0)\|g\|$. Integrals, since
$g_1 \in C_K$ and $g_n \downarrow 0$ imply $g_n \in C_K$, $\|g_n\| \downarrow 0$,
hence $|\mu(g_n)| \le \mu(g_0)\|g_n\| \downarrow 0$.

c) There is a one-to-one correspondence between nonnegative linear functionals
$\mu(g)$ on $C_{00}$ and measures $\mu(S)$ bounded on compacts, given by
$\mu(g) = \int \mu(dx) g(x)$: By b) and 33, $\mu(g)$ determines the measure
$\mu(S)$.

d) There is a one-to-one correspondence between bounded linear functionals
$\varphi(g)$ on $C_{00}$ and bounded signed measures $\varphi(S)$ on
$\mathcal{S}$, given by $\varphi(g) = \int \varphi(dx) g(x)$ with
$\|\varphi\| = \operatorname{Var} \varphi$: Apply c) and 33i).

e) There is a one-to-one correspondence between bounded linear functionals on
$C_0$ and bounded signed measures on $\mathcal{S}$. Compactify and apply d).






Part Two 


GENERAL CONCEPTS AND TOOLS OF 
PROBABILITY THEORY 


Probability concepts can be defined in terms of measure-theoretic 
concepts. Since probability is a normed measure and random variables 
are finite measurable functions, the properties of sequences of random 
variables are more precise than those of measurable functions on a 
general measure space. Since in probability theory probability spaces 
are but frames of reference for families of random variables, probability 
properties are to be expressed in terms of the laws of the families only. 
These laws are expressed in terms of distributions which are set functions
on the Borel fields in the range spaces. The distributions are expressed
in terms of distribution functions which are point functions
on the range spaces. In turn, to distribution functions correspond their 
Fourier-Stieltjes transforms (called characteristic functions) which are 
easier to deal with. 

The following Parts utilize the tools so developed to investigate 
probability problems. These problems are centered about the concepts
of independence and of conditioning introduced in Parts III and
IV, respectively. The corresponding sections 15 and 24 may be read
immediately after section 9.

§ 9. PROBABILITY SPACES AND RANDOM VARIABLES

9.1 Probability terminology. Probability theory has its own terminology,
born from and directly related and adapted to its intuitive
background; for the concepts and problems of probability theory are 
born from and evolve with the analysis of random phenomena. As a 
branch of mathematics, however, probability theory partakes of and 
contributes to the whole domain of mathematics and, at present, its 
general set-up is expressible in terms of measure spaces and measurable 
functions. We give below a first table of correspondences between the 
probability and measure theoretic terms. Within parentheses appear 
the abbreviations to be used throughout this book. 


probability space (pr. space)        normed measure space
elementary event                     point belonging to the space
event                                measurable set
sure event                           whole space
impossible event                     empty set
probability (pr.)                    normed measure
almost sure, almost surely (a.s.)    almost everywhere
random variable (r.v.)               finite numerical measurable function
expectation E                        integral



We shall use the pr. theory terms or the measure theory terms according
to our convenience. We summarize below in pr. terms the properties
which are specializations of those established in Part I.

I. A pr. space $(\Omega, \mathcal{A}, P)$ consists of the sure event $\Omega$,
the (nonempty) $\sigma$-field $\mathcal{A}$ of events and the pr. $P$ on
$\mathcal{A}$. Unless otherwise stated, the pr. space $(\Omega, \mathcal{A}, P)$
is fixed and $A, B, \ldots$, with or without affixes, represent events. If so
required, the pr. space can always be completed, so that every subset of a
null event becomes an event, necessarily null.

1° $\mathcal{A}$ is a $\sigma$-field: for all $A$'s, $A^c$,
$\bigcup_{j=1}^{\infty} A_j$, $\bigcap_{j=1}^{\infty} A_j$ are events.

It follows that, for every sequence $A_n$, $\liminf A_n$, $\limsup A_n$, and
$\lim A_n$ (if it exists) are events.

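2° $P$ is defined on $\mathcal{A}$ and, for all $A$'s,
$$PA \ge 0, \quad P\Big(\sum A_j\Big) = \sum PA_j, \quad P\Omega = 1.$$
It follows that
$$P\emptyset = 0, \quad PA \le PB \text{ when } A \subset B, \quad P\Big(\bigcup A_j\Big) \le \sum PA_j,$$
$$P(\liminf A_n) \le \liminf PA_n \le \limsup PA_n \le P(\limsup A_n),$$
and, if $\lim A_n$ exists, then $P(\lim A_n) = \lim PA_n$.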

II. A r.v. $X$ is a function on $\Omega$ to $R = (-\infty, +\infty)$ such that
the inverse images under $X$ of all Borel sets in $R$ are events; it suffices
to require the same of all intervals, or of all intervals $[a, b)$, or of all
intervals $(-\infty, b)$, etc.

An elementary r.v. is a function on $\Omega$ to $R$ of the form
$X = \sum x_j I_{A_j}$ where the $x_j$'s are finite numbers, the $A_j$'s are
disjoint events, and $\sum A_j = \Omega$; if there is only a finite number of
distinct $x_j$'s, then $X$ is a simple r.v.

1° Every r.v. is the finite limit of a sequence of simple r.v.'s and the
finite uniform limit of a sequence of elementary r.v.'s; and conversely.

Every nonnegative r.v. is the finite limit of a nondecreasing sequence of
nonnegative simple r.v.'s; and conversely.

2° The class of all r.v.'s is closed under the usual operations of analysis,
provided these operations yield finite functions.

3° Every finite Borel function of a finite number of r.v.'s is a r.v.

A random function is a family of r.v.'s; if the family is finite, it is a
random vector, and, if the family is denumerable, it is a random sequence,
that is, a sequence of r.v.'s.

III. Unless otherwise stated, $X, Y, \ldots$, with or without affixes, will
represent r.v.'s and, as usual, limits will be taken for $n \to \infty$.

$X_n$ converges in pr. to $X$, and we write $X_n \xrightarrow{P} X$, if, for
every $\epsilon > 0$,
$$P[|X_n - X| \ge \epsilon] \to 0.$$
$X_n$ converges a.s. to $X$, and we write $X_n \xrightarrow{a.s.} X$, if
$X_n \to X$, except perhaps on a null event (event of pr. 0) or, equivalently,
if for every $\epsilon > 0$,
$$P \bigcup_{k \ge n} [|X_k - X| \ge \epsilon] \to 0.$$
Mutual convergence in pr. ($X_n - X_m \xrightarrow{P} 0$) and a.s.
($X_n - X_m \xrightarrow{a.s.} 0$) are defined by replacing above $X_n - X$ by
$X_n - X_m$ and $X_k - X$ by $X_k - X_l$ with $k, l \ge n$, and taking limits
as $m, n \to \infty$.

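1° $X_n \xrightarrow{P} X$ if, and only if, $X_n - X_m \xrightarrow{P} 0$.
$X_n \xrightarrow{a.s.} X$ if, and only if, $X_n - X_m \xrightarrow{a.s.} 0$.

If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{P} X$. If
$X_n \xrightarrow{P} X$, then there is a subsequence
$X_{n_k} \xrightarrow{a.s.} X$ as $k \to \infty$, with
$$\sum_{k=1}^{\infty} P\Big[|X_{n_{k+1}} - X_{n_k}| \ge \frac{1}{2^k}\Big] < \infty.$$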
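The gap between the two modes of convergence in 1° can be seen on a standard illustration that is not taken from the text: on $[0, 1)$ with Lebesgue measure, let $X_n$ be the indicator of the dyadic interval $[j/2^k, (j+1)/2^k)$ where $n = 2^k + j$. The sketch below, whose function names are hypothetical, computes $P[|X_n| \ge \epsilon]$ exactly and checks that a fixed sample point is hit once in every dyadic block, while the subsequence $X_{2^k}$ settles at 0.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j); needs n >= 1."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def X(n, omega):
    """X_n(omega): indicator of the dyadic interval [j/2**k, (j+1)/2**k)."""
    k, j = block(n)
    return 1 if Fraction(j, 1 << k) <= omega < Fraction(j + 1, 1 << k) else 0

def prob_at_least_eps(n):
    """P[|X_n| >= eps] for any 0 < eps <= 1: the length of the interval."""
    k, _ = block(n)
    return Fraction(1, 1 << k)

omega = Fraction(1, 3)  # an arbitrary sample point in [0, 1)

# X_n(omega) = 1 exactly once in each block 2**k <= n < 2**(k+1),
# so X_n(omega) does not converge, although P[|X_n| >= eps] -> 0.
hits_per_block = [
    sum(X(n, omega) for n in range(1 << k, 1 << (k + 1))) for k in range(8)
]

# Along the subsequence n_k = 2**k the values are eventually 0 at omega.
subsequence = [X(1 << k, omega) for k in range(2, 10)]
```

The sequence converges in pr. to 0 without converging at any sample point, and the subsequence illustrates the last assertion of 1°.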

The terms “integral” and “expectation” and the notations $\int$ and $E$
will be considered as equivalent. In the case of r.v.'s, we have

IV. The expectation of a simple r.v. $X = \sum_{k=1}^{n} x_k I_{A_k}$ is
defined by
$$EX = \sum_{k=1}^{n} x_k PA_k.$$
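As a small illustration of the definition, the sketch below evaluates $EX = \sum x_k PA_k$ with exact rational arithmetic. The representation of a simple r.v. as a list of pairs $(x_k, PA_k)$ and the sample values are assumptions made for the example only.

```python
from fractions import Fraction

def expectation(simple_rv):
    """E X = sum_k x_k P(A_k) for X = sum_k x_k I_{A_k}.

    simple_rv is a list of pairs (x_k, P(A_k)); the events A_k are assumed
    disjoint with union the sure event, so their probabilities sum to 1.
    """
    if sum(p for _, p in simple_rv) != 1:
        raise ValueError("the events A_k must form a partition of the sure event")
    return sum(x * p for x, p in simple_rv)

# Hypothetical simple r.v.: X = -2 on A_1, 0 on A_2, 3 on A_3.
X = [
    (Fraction(-2), Fraction(1, 4)),
    (Fraction(0), Fraction(1, 4)),
    (Fraction(3), Fraction(1, 2)),
]
EX = expectation(X)                                  # -1/2 + 0 + 3/2
E_abs_X = expectation([(abs(x), p) for x, p in X])   # 1/2 + 0 + 3/2
```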

The expectation of a nonnegative r.v. $X \ge 0$ is the limit of expectations
of nonnegative simple r.v.'s $X_n$ which converge nondecreasingly to $X$:
$$EX = \lim EX_n, \quad 0 \le X_n \uparrow X.$$
The expectation of a r.v. $X = X^+ - X^-$ is given by
$$EX = EX^+ - EX^-$$
provided the right-hand side is not of the form $+\infty - \infty$; and if
$EX$ exists and is finite, $X$ is integrable.

1° $X$ is integrable if, and only if, $|X|$ is integrable.

If $X_1$ and $X_2$ are integrable and $a_1$ and $a_2$ are finite numbers, then
$a_1 X_1 + a_2 X_2$ is integrable and
$E(a_1 X_1 + a_2 X_2) = a_1 EX_1 + a_2 EX_2$; if, moreover, $X_1 \le X_2$, then
$EX_1 \le EX_2$.

If $|X_1| \le X_2$ and $X_2$ is integrable, then $X_1$ is integrable; in
particular, every bounded r.v. is integrable, and if $X$ degenerates at $a$
($X = a$ a.s.), then $EX = a$.

The indefinite expectation $\varphi_X$ of a r.v. $X$ whose expectation exists
is defined on the $\sigma$-field $\mathcal{A}$ of events $A$ by
$\varphi_X(A) = EXI_A$.

2° $\varphi_X$ on $\mathcal{A}$ is $\sigma$-finite, $\sigma$-additive, and
$P$-continuous; if $X$ is integrable, then $\varphi_X$ is bounded by $E|X|$,
and $\varphi_X(A) \to 0$ as $PA \to 0$.

3° Monotone convergence theorem. If $0 \le X_n \uparrow X$ finite or not,
then $EX_n \uparrow EX$; if $EX$ is finite, then the measurable function $X$
is a.s. a r.v.

Dominated convergence theorem. If $X_n \xrightarrow{P} X$ and
$|X_n| \le Y$ integrable, then $X$ is integrable, and $EX_n \to EX$.

Fatou-Lebesgue theorem. If $Y$ and $Z$ are integrable r.v.'s and
$Y \le X_n$ or $X_n \le Z$, then
$$E(\liminf X_n) \le \liminf EX_n \quad \text{or} \quad \limsup EX_n \le E(\limsup X_n).$$
If, moreover, $\liminf EX_n$ or $\limsup EX_n$ is finite, then, respectively,
$\liminf X_n$ or $\limsup X_n$ is a.s. a r.v.

Equivalence. Two functions on $\Omega$ are equivalent if they agree outside
a null event. Convergences in pr. and a.s., integrals and integrability
are, in fact, defined for equivalence classes and not for individual
functions. Therefore, as long as we are concerned with a sequence of
r.v.'s we can consider every r.v. of the sequence as defined up to an
equivalence. In particular, we can then extend the notion of a r.v.
as follows: a r.v. is an a.s. defined, a.s. finite and a.s. measurable
function.

Let us observe, once and for all, that when the measurable functions
under consideration are by definition $\mathcal{B}$-measurable where
$\mathcal{B}$ is a sub $\sigma$-field of events, then almost sure relations
are $P_{\mathcal{B}}$-equivalences, that is, valid up to null
$\mathcal{B}$-measurable sets.

The complex-valued case. A complex r.v. $X$ is of the form
$X = X' + iX''$ where $X'$ and $X''$ are “ordinary” or “real-valued” r.v.'s as
defined at the beginning of this section and where $i^2 = -1$; $X$ takes
its values in the complex plane of points $x' + ix''$, that is, in the plane
$R \times R$, and its expectation is the point $EX = EX' + iEX''$. In other
words, a complex r.v. $X$ is a representation of the random vector
$(X', X'')$. Similarly, a complex Borel function $g = g' + ig''$ is a
representation of the Borel vector $(g', g'')$. The definitions and properties
given below of random vectors, random sequences and, in general, random
functions extend at once to the complex case where the components,
instead of being ordinary r.v.'s, are complex-valued r.v.'s or,
equivalently, two-dimensional random vectors. The relation
$|EX| \le E|X|$ is still true; it suffices to use polar coordinates, setting
$X = \rho e^{i\alpha}$, $EX = re^{it}$, and observe that
$$r = e^{-it} E\rho e^{i\alpha} = E\rho \cos(\alpha - t) \le E\rho.$$

*9.2 Random vectors, sequences, and functions. A random vector
$X = (X_1, \ldots, X_n)$ is a finite family of r.v.'s called components of the
random vector. Every component $X_k$ induces a sub $\sigma$-field
$\mathcal{B}(X_k)$ of events, the inverse image of the Borel field in the
range-space $R_k$ of $X_k$. The random vector has for range space the
$n$-dimensional real space $R^n = \prod_{k=1}^{n} R_k$ with points
$x = (x_1, \ldots, x_n)$ and it induces a $\sigma$-field
$\mathcal{B}(X) = \mathcal{B}(X_1, X_2, \ldots, X_n)$, the inverse image of the
Borel field in $R^n$. The inverse images of intervals
$(-\infty, x) \subset R^n$ are events
$$[X < x] = [X_1 < x_1, \ldots, X_n < x_n] = \bigcap_{k=1}^{n} [X_k < x_k]$$
and, hence, are intersections of events belonging to the $\mathcal{B}(X_k)$.
Since the Borel field $\mathcal{B}^n$ in $R^n$ is the minimal $\sigma$-field
over the class of these intervals, the $\sigma$-field $\mathcal{B}(X)$ is the
minimal $\sigma$-field over these intersections or, equivalently, over the
union of the $\mathcal{B}(X_k)$: a compound or union $\sigma$-field
$\mathcal{B}(X_1, \ldots, X_n)$ with component $\sigma$-fields
$\mathcal{B}(X_k)$. Thus, the elements of $\mathcal{B}(X)$ are events and the
random vector $X$ can be defined as a measurable function on the pr. space to
the $n$-dimensional Borel space $(R^n, \mathcal{B}^n)$. We define $EX$ to be
$(EX_1, EX_2, \ldots, EX_n)$, a point in the space $R^n$.

A random sequence $X = (X_1, X_2, \ldots)$ is a sequence of r.v.'s called
its components; it takes its values in the space
$R^\infty = \prod_{n=1}^{\infty} R_n$ of points $x = (x_1, x_2, \ldots)$, that
is, the space of numerical sequences. To every point $x$ with an arbitrary but
finite number of finite coordinates $x_{k_1}, \ldots, x_{k_n}$ there
corresponds the interval $(-\infty, x)$ of all points $y$ such that
$y_{k_1} < x_{k_1}, \ldots, y_{k_n} < x_{k_n}$, and the minimal $\sigma$-field
over the class of these intervals is the Borel field $\mathcal{B}^\infty$ in
$R^\infty$. Exactly as for random vectors, it follows that the inverse image
under $X$ of $\mathcal{B}^\infty$ is the minimal $\sigma$-field over the class
of all finite intersections of events $A_n \in \mathcal{B}(X_n)$, that is, the
compound or union $\sigma$-field $\mathcal{B}(X)$ with component
$\sigma$-fields $\mathcal{B}(X_n)$; then we write
$\mathcal{B}(X) = \mathcal{B}(X_1, X_2, \ldots)$ and the random sequence can
be defined as a measurable function on the pr. space to the Borel space
$(R^\infty, \mathcal{B}^\infty)$. Similarly, the definition of the expectation
of the random sequence is $EX = (EX_1, EX_2, \ldots)$, when
$EX_1, EX_2, \ldots$ exist.

A random function $X_T = (X_t, t \in T)$ is a family of r.v.'s $X_t$ where
$t$ varies over an arbitrary but fixed index set $T$. Exactly as above, the
range space of $X_T$ is the real space $R_T = \prod_{t \in T} R_t$ of points
$x_T = (x_t, t \in T)$, the space of numerical functions; intervals
$(-\infty, x_T)$ are defined for points $x_T$ with an arbitrary but finite
number of finite coordinates to be sets of all points $y_T < x_T$, that is,
$y_t < x_t$, $t \in T$; the Borel field $\mathcal{B}_T$ is the minimal
$\sigma$-field over the class of these intervals. The random function $X_T$
induces the compound or union $\sigma$-field $\mathcal{B}(X_T)$ with component
$\sigma$-fields $\mathcal{B}(X_t)$, that is, the minimal $\sigma$-field over
the class of all finite intersections of events $A_t \in \mathcal{B}(X_t)$ as
$t$ varies on $T$ or, equivalently, the inverse image under $X_T$ of the Borel
field $\mathcal{B}_T$; and the random function $X_T$ can be defined as a
measurable function on the pr. space to the Borel space
$(R_T, \mathcal{B}_T)$. By definition, $EX_T = (EX_t, t \in T)$ is a numerical
function, when the $EX_t$ exist.

A Borel function $g_{T'}$ is a function on a Borel space
$(R_T, \mathcal{B}_T)$ to a Borel space $(R_{T'}, \mathcal{B}_{T'})$ such that
the inverse image under $g_{T'}$ of the Borel field in the range space is
contained in the Borel field $\mathcal{B}_T$ in the domain $R_T$. Therefore,
if $X_T$ is a random function to $R_T$, then the function of function
$g_{T'}(X_T)$ on the pr. space to the Borel space
$(R_{T'}, \mathcal{B}_{T'})$ induces a sub $\sigma$-field of events, the
inverse image under $X_T$ of the inverse image under $g_{T'}$ of the Borel
field $\mathcal{B}_{T'}$. Thus,
$\mathcal{B}(g_{T'}(X_T)) \subset \mathcal{B}(X_T)$; in other words,
$g_{T'}(X_T)$ is $\mathcal{B}(X_T)$-measurable and, hence, is a random
function. We state this conclusion as a theorem.

A. Borel functions theorem. A Borel function of a random function
is a random function which induces a sub $\sigma$-field of events contained
in the one induced by the original random function.

Loosely speaking, a Borel function of a random function induces a
“coarser” sub $\sigma$-field of events and has “fewer” values.

9.3 Moments, inequalities, and convergences. Expectations of powers
of r.v.'s are called moments and play an essential role in the
investigations of pr. theory. They appear in the simple but powerful Markov
inequality and in the definition of the very useful notion of convergence
“in the rth mean,” that we shall introduce in this subsection. They
appear in the expansions of “characteristic functions” that we shall
examine in the next chapter. They play a basic role in the study of
sums of “independent” r.v.'s to which the next part is devoted.
Furthermore, the powerful “truncation” method, to be used extensively
in the following parts, expands tremendously the domain of applicability
of the methods of investigation based upon the use of moments.

$EX^k$ ($k = 1, 2, \ldots$) and $E|X|^r$ ($r > 0$) are called, respectively,
the $k$th moment and the $r$th absolute moment of the r.v. $X$. We may also
consider 0th moments but, for all r.v.'s, the 0th moments are 1, and
we shall limit ourselves to $k$th moments where $k$ is a positive integer,
and to $r$th absolute moments where $r$ is a positive number, unless
otherwise stated.

We establish now a few simple properties of moments. While a $k$th
moment may not exist, absolute moments always exist but may be infinite.
Since integrability is equivalent to absolute integrability, if the
$k$th absolute moment of $X$ is finite, then its $k$th moment exists and is
finite; and conversely. More generally, since $|X|^{r'} \le 1 + |X|^r$ for
$0 < r' < r$, we have

a. If $E|X|^r < \infty$, then $E|X|^{r'}$ is finite for $r' \le r$ and $EX^k$
exists and is finite for $k \le r$.

In other words, finiteness of a moment of $X$ implies existence and
finiteness of all moments of $X$ of lower order.

Upon applying the elementary inequality
$$|a + b|^r \le c_r |a|^r + c_r |b|^r, \quad r > 0,$$
where $c_r = 1$ or $2^{r-1}$ according as $r \le 1$ or $r \ge 1$, replacing
$a$ by $X$, $b$ by $Y$ and taking expectations of both sides, we obtain the

$c_r$-inequality. $E|X + Y|^r \le c_r E|X|^r + c_r E|Y|^r$, where $c_r = 1$
or $2^{r-1}$ according as $r \le 1$ or $r \ge 1$.

This inequality shows that if the $r$th absolute moments of $X$ and $Y$
exist and are finite, so is the $r$th absolute moment of $X + Y$.
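The $c_r$-inequality can be checked numerically on a finite pr. space. The sketch below is only an illustration: the joint law of $(X, Y)$, given as triples $(x, y, p)$, and the helper names are assumptions of the example, and floating-point arithmetic is used, so comparisons allow a small tolerance.

```python
def c_r(r):
    """The constant of the c_r-inequality: 1 for r <= 1, 2**(r-1) for r >= 1."""
    return 1.0 if r <= 1 else 2.0 ** (r - 1)

def E(f, joint):
    """Expectation of f(X, Y) under a finite joint law of triples (x, y, p)."""
    return sum(p * f(x, y) for x, y, p in joint)

def cr_gap(r, joint):
    """c_r E|X|^r + c_r E|Y|^r - E|X+Y|^r, nonnegative by the c_r-inequality."""
    lhs = E(lambda x, y: abs(x + y) ** r, joint)
    rhs = c_r(r) * (
        E(lambda x, y: abs(x) ** r, joint) + E(lambda x, y: abs(y) ** r, joint)
    )
    return rhs - lhs

# Hypothetical joint law of (X, Y) on three sample points.
joint = [(1.0, -3.0, 0.2), (-2.0, 0.5, 0.3), (4.0, 1.0, 0.5)]
gaps = {r: cr_gap(r, joint) for r in (0.25, 0.5, 1.0, 2.0, 3.5)}
```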

Similarly, excluding the trivial case of vanishing $E|X|^r$ or $E|Y|^s$ (in
which case the Hölder inequality below is trivially true), and replacing
$a$ by $X / E^{1/r}|X|^r$, $b$ by $Y / E^{1/s}|Y|^s$ in the elementary
inequality
$$|ab| \le \frac{|a|^r}{r} + \frac{|b|^s}{s}, \quad r > 1, \quad \frac{1}{r} + \frac{1}{s} = 1,$$
we obtain the

Hölder inequality. $E|XY| \le E^{1/r}|X|^r \cdot E^{1/s}|Y|^s$, where
$r > 1$ and $\frac{1}{r} + \frac{1}{s} = 1$.

From this inequality follows the

Minkowski inequality. If $r \ge 1$, then
$$E^{1/r}|X + X'|^r \le E^{1/r}|X|^r + E^{1/r}|X'|^r.$$
In fact, upon excluding the trivial case $r = 1$, and applying the Hölder
inequality with $Y = |X + X'|^{r-1}$ to the right-hand side terms in the
obvious inequality
$$E|X + X'|^r \le E(|X|\,|X + X'|^{r-1}) + E(|X'|\,|X + X'|^{r-1}),$$
we find
$$E|X + X'|^r \le \big(E^{1/r}|X|^r + E^{1/r}|X'|^r\big)\, E^{1/s}|X + X'|^{(r-1)s},$$
where $\frac{1}{r} + \frac{1}{s} = 1$. Upon excluding the trivial case of
vanishing $E|X + X'|^r$, noticing that $(r - 1)s = r$, and dividing both
sides by $E^{1/s}|X + X'|^r$, the asserted inequality follows.

Hölder's inequality with $r = s = 2$ is called the

Schwarz inequality: $E^2|XY| \le E|X|^2 \cdot E|Y|^2$.
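For readers who like to test such inequalities on a concrete case, here is a minimal numerical sketch of the Hölder, Schwarz and Minkowski inequalities on a finite pr. space. The joint law and all helper names are hypothetical, and floating-point arithmetic is used, so a small tolerance is allowed.

```python
def E(f, joint):
    """Expectation of f(X, Y) under a finite joint law of triples (x, y, p)."""
    return sum(p * f(x, y) for x, y, p in joint)

def norm(f, r, joint):
    """(E|f(X, Y)|^r)^(1/r), written E^{1/r}|.|^r in the text."""
    return E(lambda x, y: abs(f(x, y)) ** r, joint) ** (1.0 / r)

def holder_gap(r, joint):
    """E^{1/r}|X|^r * E^{1/s}|Y|^s - E|XY| with 1/r + 1/s = 1 and r > 1."""
    s = r / (r - 1.0)
    bound = norm(lambda x, y: x, r, joint) * norm(lambda x, y: y, s, joint)
    return bound - E(lambda x, y: abs(x * y), joint)

def minkowski_gap(r, joint):
    """E^{1/r}|X|^r + E^{1/r}|Y|^r - E^{1/r}|X+Y|^r for r >= 1."""
    return (
        norm(lambda x, y: x, r, joint)
        + norm(lambda x, y: y, r, joint)
        - norm(lambda x, y: x + y, r, joint)
    )

# Hypothetical joint law of (X, Y) on three sample points.
joint = [(1.0, -3.0, 0.2), (-2.0, 0.5, 0.3), (4.0, 1.0, 0.5)]
holder = [holder_gap(r, joint) for r in (1.5, 2.0, 3.0)]   # r = 2 is Schwarz
minkowski = [minkowski_gap(r, joint) for r in (1.0, 1.5, 2.0, 3.0)]
```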

Replacing $X$ by $|X|^{(r-r')/2}$ and $Y$ by $|X|^{(r+r')/2}$, with
$r' \le r$, and taking logarithms of both sides, we obtain the inequality
$$\log E|X|^r \le \tfrac{1}{2} \log E|X|^{r-r'} + \tfrac{1}{2} \log E|X|^{r+r'}.$$

b. $\log E|X|^r$ is a convex function of $r$.

Hölder's inequality with $X$, $Y$, $r$, $s$ replaced respectively by
$|X|^r$, $1$, $p/r$, $q/r$ (hence $\frac{1}{r} = \frac{1}{p} + \frac{1}{q}$)
becomes $E^{1/r}|X|^r \le E^{1/p}|X|^p$ for $r < p$.

Hence,

c. $E^{1/r}|X|^r$ is nondecreasing in $r$.

In fact, $E^{1/r}|X|^r \uparrow E^{1/p}|X|^p$ as $r \uparrow p$. For, if
$E|X|^p < \infty$, then $|X|^r \le \max(1, |X|^p)$ and the dominated
convergence theorem applies. If $E|X|^p = \infty$, apply what precedes to
$Y_n = |X| I_{[|X| < n]}$, then let $n \to \infty$.

We introduce now convergence in the $r$th mean. Let $X_n$ and $X$ be
r.v.'s with finite $r$th absolute moments, so that, by the $c_r$-inequality,
the same is true of $X_n - X$. We say that the sequence $X_n$ converges
to $X$ in the $r$th mean, and write $X_n \xrightarrow{r} X$, if
$E|X_n - X|^r \to 0$.

Let $X_n \xrightarrow{r} X$. If $r \le 1$, then it follows, by the
$c_r$-inequality, that
$$\big| E|X_n|^r - E|X|^r \big| \le E|X_n - X|^r \to 0,$$
and, if $r > 1$, then it follows, by the Minkowski inequality, that
$$\big| E^{1/r}|X_n|^r - E^{1/r}|X|^r \big| \le E^{1/r}|X_n - X|^r \to 0.$$
This proves that

d. If $X_n \xrightarrow{r} X$, then $E|X_n|^r \to E|X|^r$.

We conclude this subsection with a simple but basic inequality and 
a few of its applications. 

A. Basic inequality. Let $X$ be an arbitrary r.v. and let $g$ on $R$ be a
nonnegative Borel function.

If $g$ is even and is nondecreasing on $[0, +\infty)$ then, for every
$a \ge 0$,
$$\frac{Eg(X) - g(a)}{\text{a.s.} \sup g(X)} \le P[|X| \ge a] \le \frac{Eg(X)}{g(a)}.$$
If $g$ is nondecreasing on $R$, then the middle term is replaced by
$P[X \ge a]$, where $a$ is an arbitrary number.

The proof is immediate. Since $g$ is a Borel function on $R$, it follows
that $g(X)$ is a measurable function on $\Omega$ and, since $g$ is
nonnegative on $R$, its integral exists. If $g$ is even and is nondecreasing
on $[0, +\infty)$, then, setting $A = [|X| \ge a]$, from the obvious relations
$$Eg(X) = \int_A g(X) + \int_{A^c} g(X)$$
and
$$g(a) PA \le \int_A g(X) \le \text{a.s.} \sup g(X) \cdot PA, \quad 0 \le \int_{A^c} g(X) \le g(a),$$
it follows that
$$g(a) PA \le Eg(X) \le \text{a.s.} \sup g(X) \cdot PA + g(a).$$

This proves the first assertion and the second is similarly proved.

Applications. (1) Upon taking $g(x) = e^{rx}$ ($r > 0$), we obtain
$$\frac{Ee^{rX} - e^{ra}}{\text{a.s.} \sup e^{rX}} \le P[X \ge a] \le e^{-ra} Ee^{rX}.$$

(2) Upon taking $g(x) = |x|^r$ ($r > 0$) we obtain
$$\frac{E|X|^r - a^r}{\text{a.s.} \sup |X|^r} \le P[|X| \ge a] \le \frac{E|X|^r}{a^r};$$
the right-hand side inequality is called the Markov inequality, and for
$r = 2$ it reduces to the celebrated Tchebichev inequality.
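The two-sided bound in (2) can be verified exactly on a small discrete law. In the sketch below, the law `dist` and the function names are hypothetical; rational arithmetic keeps every comparison exact, and $r$ is taken to be a positive integer and $a$ a positive integer for that reason.

```python
from fractions import Fraction

# Hypothetical law of X: value -> probability.
dist = {
    -3: Fraction(1, 10),
    -1: Fraction(3, 10),
    0: Fraction(2, 10),
    2: Fraction(3, 10),
    4: Fraction(1, 10),
}

def tail(a):
    """P[|X| >= a]."""
    return sum(p for x, p in dist.items() if abs(x) >= a)

def abs_moment(r):
    """E|X|^r for a positive integer r."""
    return sum(p * abs(x) ** r for x, p in dist.items())

def bounds(a, r):
    """Lower and upper bounds of application (2) for P[|X| >= a], a > 0."""
    as_sup = max(abs(x) for x, p in dist.items() if p > 0)  # a.s. sup |X|
    lower = (abs_moment(r) - Fraction(a) ** r) / Fraction(as_sup) ** r
    upper = abs_moment(r) / Fraction(a) ** r                # Markov bound
    return lower, upper

checks = [
    bounds(a, r)[0] <= tail(a) <= bounds(a, r)[1]
    for a in (1, 2, 3, 4)
    for r in (1, 2, 3)
]
```

With $r = 2$ the upper bound is the Tchebichev bound; for this law $EX^2 = 4$, so `bounds(3, 2)` gives the upper bound $4/9$ against the true tail $1/5$.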

Upon applying Markov's inequality with $X$ replaced by $X_n - X$, it
follows that

If $X_n \xrightarrow{r} X$, then $X_n \xrightarrow{P} X$, and if the $X_n$ are
a.s. uniformly bounded, then, conversely, $X_n \xrightarrow{P} X$ implies that
$X_n \xrightarrow{r} X$.


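(3) Upon taking $g(x) = \dfrac{|x|^r}{1 + |x|^r}$ ($r > 0$), we obtain
$$E\frac{|X|^r}{1 + |X|^r} - \frac{a^r}{1 + a^r} \le P[|X| \ge a] \le \frac{1 + a^r}{a^r}\, E\frac{|X|^r}{1 + |X|^r};$$
replacing $X$ by $X_n - X$ and by $X_m - X_n$, it follows that, as
$m, n \to \infty$,

$X_n \xrightarrow{P} X$ if, and only if,
$E\dfrac{|X_n - X|^r}{1 + |X_n - X|^r} \to 0$;

$X_m - X_n \xrightarrow{P} 0$ if, and only if,
$E\dfrac{|X_m - X_n|^r}{1 + |X_m - X_n|^r} \to 0$.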

Remark. Observe that the function defined by
$d(X, Y) = E\dfrac{|X - Y|}{1 + |X - Y|}$ has the triangular and
identification properties of a distance, except that $d(X, Y) = 0$ implies
only that $X = Y$ a.s. It follows from the foregoing proposition that

The space of the equivalence classes of the r.v.'s defined on a pr. space is
a complete metric space with distance $d$ defined by
$$d(X, Y) = E\frac{|X - Y|}{1 + |X - Y|}$$
and convergence in distance is equivalent to convergence in pr.

*Convex functions. The relations between moments established at
the beginning of this subsection are essentially convexity properties.
Let us recall a few classical properties of convex functions.

Let $g$ be a (numerical) Borel function defined on a finite or an infinite
open interval $I \subset R$. $g$ is said to be convex if, for every pair of
points $x$, $x'$ of $I$,
$$g\Big(\frac{x + x'}{2}\Big) \le \frac{1}{2} g(x) + \frac{1}{2} g(x');$$
if $g$ is twice differentiable on $I$, then the convexity property is
equivalent to $g'' \ge 0$ on $I$. The same definition applies to $g$ on an
$N$-dimensional interval $I^N$ and is equivalent to the convexity of the
function $g(x + ux')$ of the numerical argument $u$ for all values of $u$ for
which $x + ux' \in I^N$, so that it suffices to consider convex functions on
$I \subset R$. A convex function on $I$ is either continuous on $I$ or is not
a Borel function. Thus, from now on, a convex function will be assumed to be
continuous on its domain. In that case, $g$ is convex on $I$ if, and only if,
to every $x_0 \in I$ there corresponds a number $\lambda(x_0)$ such that, for
all $x \in I$,
$$\lambda(x_0)(x - x_0) \le g(x) - g(x_0).$$

Let $X$ be a r.v. whose values lie a.s. in $I$ and whose expectation $EX$
exists and is finite. Replacing $x_0$ by $EX$ and $x$ by $X$, and taking the
expectation of both sides of the foregoing inequality, it follows that

e. If $g$ is convex and $EX$ is finite, then
$$g(EX) \le Eg(X).$$
If $g$ is strictly increasing, then this relation can be written
$$EX \le g^{-1}(Eg(X)).$$
For example, for $r \ge 1$, $g(x) = x^r$ ($x \in (0, +\infty)$) being convex,
we have $E|X| \le E^{1/r}|X|^r$.
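A short numerical sketch of e may help fix ideas. The three-point law below and the helper `E` are assumptions of the example; the square is handled exactly with rationals and the exponential with floats.

```python
import math
from fractions import Fraction

# Hypothetical law of X: values with their probabilities.
values = [Fraction(-1), Fraction(0), Fraction(2)]
probs = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]

def E(g):
    """E g(X) for the finite law above."""
    return sum(p * g(x) for x, p in zip(values, probs))

mean = E(lambda x: x)

# g(EX) <= E g(X) for two convex functions g.
jensen_square = (mean ** 2, E(lambda x: x ** 2))
jensen_exp = (math.exp(mean), E(lambda x: math.exp(x)))

# The example in the text with r = 2: E|X| <= (E|X|^2)^(1/2).
abs_mean = E(abs)
root_second_moment = math.sqrt(E(lambda x: x ** 2))
```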

More generally, let $G_1$ and $G_2$ be two continuous and strictly increasing
functions such that $g = G_2 G_1^{-1}$ is convex; we say then that $G_2$ is
convex in $G_1$. Since $Y = G_1(X)$ implies that $X = G_1^{-1}(Y)$, it follows
by e, upon assuming that $EX$ and $EY$ are finite, that
$$G_2 G_1^{-1}(EY) \le E G_2 G_1^{-1}(Y)$$
and, hence,

e'. If $G_2$ is convex in $G_1$, then
$$G_1^{-1}(EG_1(X)) \le G_2^{-1}(EG_2(X)).$$
For example, since on $(0, +\infty)$, $x^{r_2}$ is convex in $x^{r_1}$ for
$r_2 \ge r_1$, that is, the function $x^{r_2/r_1}$ is convex, we have
$$E^{1/r_1}|X|^{r_1} \le E^{1/r_2}|X|^{r_2} \quad \text{for } r_2 \ge r_1.$$

*9.4 Spaces $L_r$. The r.v.'s whose $r$th absolute moments are finite
are said to form the space $L_r$ over the pr. space
$(\Omega, \mathcal{A}, P)$; in symbols, $X \in L_r$ if $E|X|^r < \infty$; we
drop $r$ if $r = 1$. We shall find later that the space $L_2$ is a very
important tool in the investigation of pr. problems, especially those relative
to sums of “independent” r.v.'s. It will be convenient to introduce two
boundary cases. The first is the trivial space $L_0$ of all r.v.'s $X$, since
$E|X|^0 = 1$ is finite. The second is the space $L_\infty$ of all a.s. bounded
r.v.'s. Since $\lim_{r \to \infty} E|X|^r < \infty$ if, and only if,
$|X| \le 1$ a.s., it seems that only the subspace
$L'_\infty \subset L_\infty$ of r.v.'s a.s. bounded by 1 ought to be
introduced. However, for $r \to \infty$ it is $\lim E^{1/r}|X|^r$ which
counts, and this limit is finite if, and only if, $X$ is a.s. bounded.
In fact, let $s$ be the a.s. supremum of $|X|$, defined by $P[|X| > s] = 0$
and $P[|X| \ge c] > 0$ for every $c < s$; we have $s \le \infty$. The
foregoing assertion is implied by

a. $E^{1/\infty}|X|^\infty = \lim_{r \to \infty} E^{1/r}|X|^r = \text{a.s.} \sup |X| = s$.

For
$$s \ge E^{1/r}|X|^r \ge E^{1/r}\big(|X|^r I_{[|X| \ge c]}\big) \ge c\, P^{1/r}[|X| \ge c] \to c$$
as $r \to \infty$, then $c \uparrow s$.
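The limit in a, together with the monotonicity 9.3c, can be observed numerically. The sketch below uses a hypothetical three-point law and floating-point arithmetic; the exponents are kept moderate so that the powers stay within floating-point range.

```python
# Hypothetical law of X: values with their probabilities.
values = [0.5, -2.0, 3.0]
probs = [0.6, 0.3, 0.1]

def norm(r):
    """(E|X|^r)^(1/r), written E^{1/r}|X|^r in the text."""
    return sum(p * abs(x) ** r for x, p in zip(values, probs)) ** (1.0 / r)

# a.s. sup |X|: the largest |x| carrying positive probability.
as_sup = max(abs(x) for x, p in zip(values, probs) if p > 0)

exponents = (0.5, 1, 2, 4, 8, 50, 200)
norms = [norm(r) for r in exponents]
```

The list `norms` increases with the exponent and approaches `as_sup = 3.0` from below; the approach is slow because of the factor $P^{1/r}[|X| \ge c]$ in the proof.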

The foregoing definitions permit us to state 9.3a as follows:

b. $L_0 \supset L_r \supset L_s \supset L_\infty \supset L'_\infty$,
$0 \le r \le s \le \infty$.

Let us observe that the space of all simple r.v.'s is a subspace of
$L_\infty$ and, hence, of all the spaces $L_r$.

Since, by the $c_r$- and Minkowski inequalities and by a,
$$E|X + Y|^r \le E|X|^r + E|Y|^r, \quad 0 < r < 1,$$
$$E^{1/r}|X + Y|^r \le E^{1/r}|X|^r + E^{1/r}|Y|^r, \quad 1 \le r \le \infty,$$
and $E|X - Y|^r = 0$ if, and only if, $X$ and $Y$ are equivalent, we have,
according to the definitions relative to metric and normed spaces, the
following theorem.

A. The spaces $L_r$ are linear metric spaces with metric defined by
$d(X, Y) = E|X - Y|^r$ for $0 < r < 1$ and norm $E^{1/r}|X|^r$ for
$1 \le r \le \infty$, provided equivalent r.v.'s are identified.


The problem arises whether the spaces $L_r$ are complete and what are
the convergence theorems in these spaces. Unless otherwise stated,
from now on $0 < r < \infty$ (the reader is invited to examine in each case
the boundary spaces $L_0$ and $L_\infty$).

First we observe that on account of A and 9.3d we have

c. Convergence in distance $d(X_n, X) \to 0$ in $L_r$ is equivalent to
convergence in the $r$th mean $X_n \xrightarrow{r} X$ and implies convergence
of distances $d(X_n, X_0) \to d(X, X_0)$ to any fixed $X_0 \in L_r$.

Also, if $X_n \in L_r$ then, for a r.v. $X$, $E|X_n - X|^r$, which always
exists, can converge to 0 only if, from some value of $n$ on,
$E|X_n - X|^r$ is finite and, hence, only if $X \in L_r$, so that

d. If $X_n$ is a sequence in $L_r$ and $E|X_n - X|^r \to 0$, then
$X \in L_r$.

We are now in a position to prove the

B. $L_r$-completeness theorem. Let the $X_n \in L_r$. Then
$X_n \xrightarrow{r}$ some $X$ if, and only if,
$X_m - X_n \xrightarrow{r} 0$, as $m, n \to \infty$.

Proof. If $X_n \xrightarrow{r} X$, then $X_m - X_n \xrightarrow{r} 0$, since,
by the $c_r$-inequality,
$$E|X_m - X_n|^r \le c_r E|X_m - X|^r + c_r E|X - X_n|^r \to 0.$$
Conversely, if $X_m - X_n \xrightarrow{r} 0$, then, by the Markov inequality,
for every $\epsilon > 0$,
$$P[|X_m - X_n| \ge \epsilon] \le \frac{1}{\epsilon^r} E|X_m - X_n|^r \to 0 \quad \text{as } m, n \to \infty,$$
so that $X_m - X_n \xrightarrow{P} 0$. Therefore, there is a subsequence
$X_{n'} \xrightarrow{a.s.}$ some $X$ as $n' \to \infty$ and, for every fixed
$m$, $X_m - X_{n'} \xrightarrow{a.s.} X_m - X$ as $n' \to \infty$. Since
$E|X_m - X_{n'}|^r \to 0$ as $m, n' \to \infty$, it follows, by the
Fatou-Lebesgue theorem and the hypothesis, that
$$E|X_m - X|^r \le \liminf_{n'} E|X_m - X_{n'}|^r \to 0 \quad \text{as } m \to \infty.$$
Thus, $X_n \xrightarrow{r} X$, and the proof is complete.

If a r.v. $X$ is integrable, then the (indefinite) integral of $X$ is
$P$-absolutely continuous: $\int_A |X| \to 0$ as $PA \to 0$. Let
$B = [|X| \ge a]$. Since $PB \to 0$ as $a \to \infty$, it follows that
$\int_B |X| \to 0$ as $a \to \infty$. Conversely, this implies that
$$\int_A |X| = \int_{AB} |X| + \int_{AB^c} |X| \le \int_B |X| + aPA \to 0$$
as $PA \to 0$ then $a \to \infty$, and thus implies that $X$ is integrable,
since, given $\epsilon > 0$,
$$\int |X| = \int_B |X| + \int_{B^c} |X| \le \epsilon + a < \infty$$
for $a = a_\epsilon$ sufficiently large.

The integrals of r.v.'s $X_n$ are uniformly $P$-absolutely continuous or
simply uniformly continuous if $\int_A |X_n| \to 0$ uniformly in $n$ as
$PA \to 0$; in other words, for every $\epsilon > 0$ there exists a
$\delta_\epsilon$ independent of $n$ such that $\int_A |X_n| < \epsilon$ for
any set $A$ with $PA < \delta_\epsilon$. Let $B_n = [|X_n| \ge a]$.

The r.v.'s $|X_n|$ are uniformly integrable if $\int_{B_n} |X_n| \to 0$
uniformly in $n$, as $a \to \infty$. Observe that if the $E|X_n|$ are
uniformly bounded, say, by $c$ ($< \infty$), then, by Markov's inequality,
$PB_n \le c/a \to 0$ as $a \to \infty$. Upon replacing $X$ by $X_n$ and $B$ by
$B_n$ in the foregoing discussion, it follows that

e. The r.v.'s $X_n$ are uniformly integrable if, and only if, their integrals
are uniformly bounded and uniformly continuous.

Let $X_n \xrightarrow{r} X$ hence $X_n I_A \xrightarrow{r} X I_A$. It follows, by 9.3d and the above
lemma (take $A = \Omega$ and take A such that $PA \to 0$)

f. If $X_n \xrightarrow{r} X$, then the $|X_n|^r$ are uniformly integrable.


For use in the forthcoming theorem, note that (Young)

The Fatou-Lebesgue theorem and the dominated convergence theorem remain
valid if therein Y and Z are replaced by $U_n$ and $V_n$ with $U_n \xrightarrow{\text{a.e.}} U$,
$V_n \xrightarrow{\text{a.e.}} V$ and $\int U_n \to \int U$ finite, $\int V_n \to \int V$ finite.

For then, the argument pp. 125-6 remains valid. Furthermore, by selecting
$\{n''\}$ p. 126 so that also $U_{n''} \xrightarrow{\text{a.e.}} U$, we have

If $|X_n| \le U_n$ with $U_n \xrightarrow{\mu} U$ and $\int U_n \to \int U$ finite, then $X_n \xrightarrow{\mu} X$
implies that $\int X_n \to \int X$; in fact $\int |X_n - X| \to 0$.


C. $L_r$-convergence theorem. Let the $X_n \in L_r$. Then

(i) $X_n \xrightarrow{r} X$ if and only if (ii) $X_n \xrightarrow{P} X$
and one of the following conditions holds:

(iii) $\int |X_n|^r \to \int |X|^r < \infty$; (iv) the $|X_n|^r$ are uniformly integrable;
(v) the $|X_n|^r$, or (vi) the $|X_n - X|^r$, have uniformly continuous integrals.

Proof. Let $\epsilon > 0$ be arbitrary, set $A_n = [|X_n - X| \ge \epsilon]$, $A_{mn} =
[|X_m - X_n| \ge \epsilon]$, and let $m, n \to \infty$. We use the $c_r$-inequality without
further comment. Note that (iv) implies $X_n \in L_r$.

Condition (i) implies (ii) by Markov inequality ($PA_n \le E|X_n -
X|^r/\epsilon^r \to 0$) and implies (iii) by 9.3d. Conversely, (ii) and (iii) imply
(i), since then $|X_n - X|^r \le c_r|X_n|^r + c_r|X|^r = U_n$ with
$U_n \xrightarrow{P} 2c_r|X|^r$ and $\int U_n \to 2c_r \int |X|^r < \infty$.

As for the remaining assertions, (i) implies (iv) by f, and (iv) implies
(v) by e applied to the $|X_n|^r$ in lieu of the $X_n$. Also, clearly (i) implies
(vi), and (vi) implies (v), since it implies integrability of $|X_n - X|^r$
hence of $|X|^r$ (because $X_n \in L_r$) so that

$$\int_A |X_n|^r \le c_r \int_A |X_n - X|^r + c_r \int_A |X|^r < \epsilon$$

for PA sufficiently small.


Thus, to complete the proof, it suffices to show that (ii) and (v) imply
(i). Since convergence in pr. (in the rth mean) is equivalent to mutual
convergence in pr. (in the rth mean) and $X_n \xrightarrow{P} X$, $X_n \xrightarrow{r} Y$ imply
that Y = X a.s., we can replace (i) and (ii) by (i') $E|X_m - X_n|^r \to 0$
and (ii') $PA_{mn} \to 0$. The assertion follows since, upon integrating
$|X_m - X_n|^r$ on $A_{mn}$ and on $A_{mn}^c$, (ii') and (v) imply that as $m, n \to \infty$
then $\epsilon \to 0$,

$$E|X_m - X_n|^r \le c_r \int_{A_{mn}} |X_m|^r + c_r \int_{A_{mn}} |X_n|^r + \epsilon^r \to 0.$$

Corollary 1. $X_n \xrightarrow{r} X$ implies $X_n \xrightarrow{r'} X$ for $r' < r$.

Set $A_n = [|X_n - X| \ge 1]$ and observe that

$$\int_A |X_n - X|^{r'} \le \int_{AA_n} |X_n - X|^{r} + PA.$$

Corollary 2. If $\sup E|X_n|^r = c < \infty$, then $X_n \xrightarrow{P} X$ implies
$X_n \xrightarrow{r'} X$ for $r' < r$.

Let $A_n = [|X_n| \ge a]$ and observe that

$$\int_A |X_n|^{r'} = \int_{AA_n} |X_n|^{r'} + \int_{AA_n^c} |X_n|^{r'} \le c\,a^{r'-r} + a^{r'} PA < \epsilon$$

by taking a sufficiently large to have $c\,a^{r'-r} < \epsilon/2$ and, then, PA sufficiently
small to have $a^{r'} PA < \epsilon/2$.

Corollary 3. If $|X_n| \le Y \in L_r$ for large n, then $X_n \xrightarrow{P} X$ implies
$X_n \xrightarrow{r} X \in L_r$.

Observe that for large n, $\int_A |X_n|^r \le \int_A Y^r$.

We proved in 9.3 a particular case of this corollary, with Y = c < ∞.

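We summarize below the relations between various types of convergence:

$$X_n \xrightarrow{\text{a.s.}} X \;\Longrightarrow\; X_n \xrightarrow{P} X \;\Longrightarrow\; X_{n_k} \xrightarrow{\text{a.s.}} X \text{ with } \sum_k P\Big[|X_{n_k} - X| \ge \frac{1}{2^k}\Big] < \infty$$

$$X_n \xrightarrow{r} X \;\Longrightarrow\; X_n \xrightarrow{P} X$$

$$X_n \xrightarrow{r} X \;\Longrightarrow\; X_n \xrightarrow{r'} X, \quad r' < r.$$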
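The gap between convergence in pr. and convergence in the rth mean can be checked on a small example. The sketch below assumes a hypothetical sequence $X_n = n$ with pr. $1/n$ and $X_n = 0$ otherwise (the function names are illustrative only). Then $X_n \xrightarrow{P} 0$ and $\sup E|X_n| = 1 < \infty$, so Corollary 2 with r = 1 gives convergence in the r'th mean for r' < 1, while $E|X_n| = 1$ does not tend to 0.

```python
from fractions import Fraction

def prob_exceeds(n, eps):
    """P[|X_n| >= eps] for X_n = n with pr. 1/n and 0 otherwise (0 < eps <= n)."""
    if not 0 < eps <= n:
        raise ValueError("this sketch only covers 0 < eps <= n")
    return Fraction(1, n)

def rth_moment(n, r):
    """E|X_n|^r = n^r * (1/n) for the two-valued r.v. above."""
    return n ** r / n

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        # pr. of a deviation, first absolute moment, and moment of order 1/2
        print(n, prob_exceeds(n, 1), rth_moment(n, 1), rth_moment(n, 0.5))
```

The first absolute moments stay equal to 1, so the $|X_n|$ are not uniformly integrable, which is exactly the condition that fails in theorem C.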

The operation of integration on the complete normed linear space L r 
with r ^ 1 can be characterized as a functional of the integrand, as 
follows: 

D. Integral representation theorem. Let $\frac{1}{r} + \frac{1}{s} = 1$ with $1 \le r < \infty$.
A functional f on $L_r$ is linear and continuous if, and only if, there exists
a r.v. $Y \in L_s$ such that f(X) = EXY for every $X \in L_r$; then f determines
Y up to an equivalence and $\|f\| = E^{1/s}|Y|^s$.

Proof. Since $\frac{1}{r} + \frac{1}{s} = 1$ and $1 \le r < \infty$ it follows that $1 < s \le \infty$,
and we apply repeatedly Hölder’s inequality $E|XY| \le \|X\|_r \|Y\|_s$
where $\|X\|_r = E^{1/r}|X|^r$ and $\|Y\|_s = E^{1/s}|Y|^s$ with $\|Y\|_\infty = \lim_{s \to \infty}
E^{1/s}|Y|^s$ = a.s. sup |Y|.

If $\|X\|_r \|Y\|_s$ is finite, then f(X) = EXY exists, is finite, and defines
a normed functional f on $L_r$ with $\|f\| \le \|Y\|_s$. Since EXY is
linear in $X \in L_r$, so is f(X). Being normed and linear, f is continuous.

Conversely, let a functional f on $L_r$ be continuous and linear; linearity
implies additivity and additivity implies $f(\theta) = 0$, where $\theta$ is the zero-point
of $L_r$, that is, the class of r.v.’s degenerate at 0. Therefore, the
set function $\varphi$ on $\mathcal{A}$ defined by $\varphi(A) = f(I_A)$ is continuous and additive,
hence σ-additive, and vanishes for null events, hence is P-continuous.
Thus, the Radon-Nikodym theorem applies and $\varphi$ on $\mathcal{A}$ determines
up to an equivalence a r.v. Y such that

$$f(I_A) = \varphi(A) = E I_A Y.$$

Since f(X) and EXY are both linear in X, it follows that f(X) = EXY
for all simple finite X ($\in L_r$). If $Y \in L_s$ and $L_r \ni X_n \xrightarrow{r} X$ hence
$X_n Y \xrightarrow{1} XY$, then, by continuity of f and of E on $L_r$, this equality
extends to all $X \in L_r$. Since f has finite norm $\|f\| \le \|Y\|_s$, to complete
the proof it suffices to show that the reverse inequality $\|f\| \ge
\|Y\|_s$ is true.

Let r > 1. If the $X_n$ are simple finite and $0 \le X_n \uparrow |Y|$, then

$$E|X_n|^s \le E(X_n^{s-1} \operatorname{sign} Y)Y \le \|f\|\, E^{1/r}|X_n|^{(s-1)r}$$

yields, since $(s-1)r = s$ and upon letting $n \to \infty$,

$$\|Y\|_s \le \|f\|.$$

Let r = 1. If there exists an $\epsilon > 0$ such that $\|Y\|_\infty \ge \|f\| + 2\epsilon$
and we set $A = [|Y| \ge \|f\| + \epsilon]$, then PA > 0 while

$$(\|f\| + \epsilon)PA \le E|I_A Y| = E(I_A \operatorname{sign} Y)Y \le \|f\| PA,$$

and we reach a contradiction. This completes the proof.

Remark. The definitions and results of this subsection extend at
once to complex-valued r.v.’s.

§ 10. PROBABILITY DISTRIBUTIONS

10.1 Distributions and distribution functions. Let X be a r.v. on
our pr. space $(\Omega, \mathcal{A}, P)$. The nonnegative set function $P_X$ defined on
the Borel field $\mathcal{B}$ in R by

$$P_X S = P[X \in S], \quad S \in \mathcal{B},$$

is called the pr. distribution or, simply, distribution of X. Since X is
finite, the inverse image under X of R is $\Omega$ and, since the inverse image
of a sum of Borel sets is the sum of their inverse images, we have

$$P_X R = 1, \quad P_X\Big(\sum S_j\Big) = \sum P_X S_j, \quad S_j \in \mathcal{B}.$$

Therefore, $P_X$ on $\mathcal{B}$ is a probability. Thus, the r.v. X induces on its
range space a new pr. space $(R, \mathcal{B}, P_X)$, to be called a pr. space induced
by X on its range space or the sample pr. space of X. Moreover,

a. The distribution $P_X$ of X determines the distributions of all r.v.’s
g(X) where g is a finite Borel function on R; and

$$Eg(X) = \int_R g\, dP_X$$

in the sense that, if either side of this expression exists, so does the other, and
then they are equal.

Proof. Every finite Borel function g(X) of a r.v. X is a r.v. and,
by definition,

$$[g(X) \in S] = [X \in g^{-1}(S)]$$

where S and $g^{-1}(S)$ are Borel sets. Therefore

$$P_{g(X)}(S) = P_X g^{-1}(S), \quad S \in \mathcal{B},$$

and the first assertion is proved.

The second assertion will follow if we prove it for nonnegative functions
g. Because of the monotone convergence theorem, it suffices to
prove it for nonnegative simple functions g and, because of the additivity
property of integrals, it suffices to prove the assertion for indicators.
Thus, let $g = I_S$, so that $g(X) = I_{[X \in S]}$. But, then, the left-hand
side of the asserted equality becomes

$$\int_\Omega I_{[X \in S]}\, dP = P[X \in S],$$

while the right-hand side becomes $\int_R I_S\, dP_X = P_X S$. Therefore, by
definition of $P_X$, the asserted equality holds, and the proof is complete.

Distributions are set functions and are not easy to handle by means
of classical analysis developed primarily to deal with point functions. 
Thus, in order to be able to use analytical methods and tools, it is of 
the greatest importance to find, and learn to use, point functions which 
“represent” distributions, that is, which are in a one-to-one correspond¬ 
ence with distributions. Such functions are obtained by the correspond¬ 
ence theorem according to which, to the finite measure Px corresponds 
one, and only one, interval function defined by 

$$F_X[a, b) = P_X[a, b) = P[a \le X < b], \quad [a, b) \subset R.$$

In turn, to this interval function corresponds one, and only one, class
of point functions on R defined up to an additive constant, by

$$F_X(b) - F_X(a) = F_X[a, b), \quad a < b \in R.$$

Recalling that $P_X$ is the distribution of a r.v. X, we select among all
those functions the function $F_X$ defined on R by

$$F_X(x) = P_X(-\infty, x) = P[X < x], \quad x \in R,$$

and call it the distribution function (d.f.) of X. Then, according to the
usual notational convention, the equality in a can be written $Eg(X) =
\int_R g\, dF_X$ and, if g is integrable and continuous on R, then the right-hand
side L.-S.-integral becomes an improper R.-S.-integral.
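The equality $Eg(X) = \int_R g\, dP_X$ and the left-continuous convention $F_X(x) = P[X < x]$ can be checked by hand on a finite pr. space. The following is only a sketch under assumptions: the four-point space, the r.v. X and the function g are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical finite pr. space: four equally likely sample points.
P = {"w1": Fraction(1, 4), "w2": Fraction(1, 4),
     "w3": Fraction(1, 4), "w4": Fraction(1, 4)}
X = {"w1": -1, "w2": 0, "w3": 0, "w4": 2}   # a r.v. on that space

def distribution(P, X):
    """Induced distribution P_X{x} = P[X = x] on the range of X."""
    PX = {}
    for w, p in P.items():
        PX[X[w]] = PX.get(X[w], Fraction(0)) + p
    return PX

def df(PX, x):
    """Left-continuous d.f. F_X(x) = P[X < x]."""
    return sum((p for v, p in PX.items() if v < x), Fraction(0))

def expect_on_omega(P, X, g):
    """E g(X) computed as an integral over the original pr. space."""
    return sum((g(X[w]) * p for w, p in P.items()), Fraction(0))

def expect_on_range(PX, g):
    """The same expectation computed as an integral of g with respect to P_X."""
    return sum((g(v) * p for v, p in PX.items()), Fraction(0))

def g(x):
    return x * x

PX = distribution(P, X)

if __name__ == "__main__":
    print(PX)
    print(expect_on_omega(P, X, g), expect_on_range(PX, g))
    print([df(PX, x) for x in (-1, 0, 2, 3)])
```

Both expectations agree because the second one only regroups the sample points according to the value taken by X.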

b. The d.f. $F_X$ of a r.v. X is nondecreasing and continuous from the
left on R, with $F_X(-\infty) = 0$ and $F_X(+\infty) = 1$. Conversely, every function
F with the foregoing properties is the d.f. of a r.v. on some pr. space.

Proof. The first assertion follows from the fact that $P[X < x]$ does
not decrease as x increases, approaches $P[X < x']$ as $x \uparrow x'$, and approaches
$P[X = -\infty] = 0$ or $P[X < +\infty] = 1$ according as $x \to -\infty$
or $x \to +\infty$. The converse follows by taking, say, for pr. space $(R,
\mathcal{B}, P)$ where P is the pr. determined, according to the correspondence
theorem, by F. Then F is the d.f. of the r.v. X defined on this pr.
space by X(x) = x, $x \in R$.

Remark. There are pr. spaces on which there can be defined r.v.'s for 
every function F with the stated properties. 

For example, take for the space $\Omega$ the interval (0, 1), for the σ-field of
events the σ-field of all Borel sets in this interval, and for pr. the Lebesgue
measure on this σ-field. Then any function F with the stated
properties is the d.f. of an inverse function X of F.
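The construction in this remark can be sketched for a step d.f. with finitely many jumps. The atoms and jump sizes below are hypothetical, and the particular inverse used (the smallest atom $x_k$ with $F(x_k + 0) \ge \omega$) is one possible choice of inverse function, taken here as an assumption. The Lebesgue measure of $[X < x]$ is computed on a grid whose cells are fine enough to make the count exact for these jump sizes.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

atoms = [-1, 0, 2]                                    # hypothetical jump points of F
jumps = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
cum = list(accumulate(jumps))                         # F(x_k + 0): 1/5, 7/10, 1

def F(x):
    """Left-continuous step d.f.: F(x) is the sum of the jumps at atoms < x."""
    return sum((p for a, p in zip(atoms, jumps) if a < x), Fraction(0))

def X(omega):
    """An inverse of F on (0, 1): the smallest atom x_k with F(x_k + 0) >= omega."""
    return atoms[bisect_left(cum, omega)]

def lebesgue_measure_X_below(x, n=1000):
    """Lebesgue measure of {omega in (0, 1): X(omega) < x}.

    X is constant on each cell ((j)/n, (j+1)/n] here because every value in
    `cum` is a multiple of 1/n, so evaluating at cell midpoints is exact.
    """
    midpoints = (Fraction(2 * j + 1, 2 * n) for j in range(n))
    return Fraction(sum(1 for w in midpoints if X(w) < x), n)

if __name__ == "__main__":
    for x in (-2, -1, 0, 1, 2, 3):
        print(x, F(x), lebesgue_measure_X_below(x))
```

For every x the two printed values coincide, which is the statement that F is the d.f. of X on ((0, 1), Borel sets, Lebesgue measure).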
The weakest type of convergence of sequences of r.v.’s considered
so far is convergence in pr. In turn, it implies a type of convergence
of d.f.’s, as follows:

c. If $X_n \xrightarrow{P} X$, then $F_{X_n} \to F_X$ on the continuity set $C(F_X)$ of $F_X$.

Proof. Since

$$[X < x'] = [X_n < x, X < x'] + [X_n \ge x, X < x'] \subset [X_n < x] \cup [X_n \ge x, X < x'],$$

we have

$$F_X(x') \le F_{X_n}(x) + P[X_n \ge x, X < x'].$$

If $X_n - X \xrightarrow{P} 0$, then, for $x' < x$,

$$P[X_n \ge x, X < x'] \le P[|X_n - X| \ge x - x'] \to 0$$

and, hence,

$$F_X(x') \le \liminf F_{X_n}(x), \quad x' < x.$$

Similarly, interchanging X and $X_n$, x and x', we obtain

$$\limsup F_{X_n}(x) \le F_X(x''), \quad x < x''.$$

Therefore, for $x' < x < x''$,

$$F_X(x') \le \liminf F_{X_n}(x) \le \limsup F_{X_n}(x) \le F_X(x'')$$

and, if $x \in C(F_X)$, it follows, letting $x' \uparrow x$ and $x'' \downarrow x$, that

$$F_X(x) = \lim F_{X_n}(x).$$

The same argument with $X'_n$ in lieu of X and $x', x'' \in C(F_X)$ yields

d. If $X_n - X'_n \xrightarrow{P} 0$ and $F_{X'_n} \to F_X$ on $C(F_X)$, then $F_{X_n} \to F_X$ on
$C(F_X)$.

Particular case. There is an important case in which convergence in
pr. and convergence of d.f.’s are equivalent:

$X_n \xrightarrow{P} c$ if, and only if, $F_{X_n}(x) \to 0$ or 1 according as x < c or x > c.
Follows by c and d.

First Extension. Let $X = (X_1, \ldots, X_N)$ be a random vector or,
equivalently, a finite class of r.v.’s $X_1, \ldots, X_N$. The distribution of X
is defined on the Borel field $\mathcal{B}^N$ in the N-dimensional space $R^N =
\prod_{k=1}^{N} R_k$ by

$$P_X(S) = P[X \in S], \quad S \in \mathcal{B}^N.$$

As for a r.v., $P_X$ is a pr. and the induced pr. space is $(R^N, \mathcal{B}^N, P_X)$.
Proposition a, with its proof, continues to be valid: the first part holds
for every finite Borel function g on $R^N$ to some $R^{N'}$ and the second part
holds for every component of g.

The distribution function (d.f.) $F_X$ on $R^N$ of X is still defined by

$$F_X(x) = P_X(-\infty, x) = P[X < x], \quad x \in R^N,$$

or, more explicitly, by

$$F_{X_1, \ldots, X_N}(x_1, \ldots, x_N) = P[X_1 < x_1, \ldots, X_N < x_N].$$

$P_X$ determines the increment function of $F_X$ and, conversely, by

$$P_X[a, b) = F_X[a, b) = \Delta_{b-a} F_X(a), \quad a < b \in R^N,$$

or, more explicitly, by

$$P[a_1 \le X_1 < b_1, \ldots, a_N \le X_N < b_N] = \Delta_{b_1 - a_1} \cdots \Delta_{b_N - a_N} F_{X_1, \ldots, X_N}(a_1, \ldots, a_N)$$

where $\Delta_{b_k - a_k}$, $k = 1, \ldots, N$, is the difference operator of step $b_k - a_k$
operating on $a_k$.
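For N = 2 the iterated difference reduces to the familiar four-corner formula $F(b_1, b_2) - F(a_1, b_2) - F(b_1, a_2) + F(a_1, a_2)$. The sketch below checks it against a direct computation; the joint law (point masses on four lattice points) is a hypothetical example chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint law of (X1, X2): point masses on a few lattice points.
mass = {(0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
        (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8)}

def F(x1, x2):
    """Joint d.f. F(x1, x2) = P[X1 < x1, X2 < x2]."""
    return sum((p for (u, v), p in mass.items() if u < x1 and v < x2), Fraction(0))

def increment(a, b):
    """Iterated difference of F over [a, b): signed sum of F at the four corners."""
    total = Fraction(0)
    for choice in product((0, 1), repeat=2):
        corner = [b[k] if choice[k] else a[k] for k in range(2)]
        sign = (-1) ** (2 - sum(choice))     # one factor -1 per lower endpoint used
        total += sign * F(*corner)
    return total

def rectangle_probability(a, b):
    """Direct computation of P[a1 <= X1 < b1, a2 <= X2 < b2]."""
    return sum((p for (u, v), p in mass.items()
                if a[0] <= u < b[0] and a[1] <= v < b[1]), Fraction(0))

if __name__ == "__main__":
    for a, b in (((1, 0), (2, 2)), ((0, 1), (1, 2)), ((0, 0), (2, 2))):
        print(a, b, increment(a, b), rectangle_probability(a, b))
```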

Proposition b and its proof, as well as the remark, remain valid,
provided $F_X$ “nondecreasing” means that $\Delta_h F_X \ge 0$ for h > 0, that
is, $h_1 > 0, \ldots, h_N > 0$, and $x \to -\infty$ or $x \to +\infty$ means that one at
least of the $x_k \to -\infty$ or that all the $x_k \to +\infty$, respectively.

Proposition c and its proof remain valid, provided $X_n \xrightarrow{P} X$ means
that every one of the components $X_{nk} \xrightarrow{P} X_k$, $k = 1, \ldots, N$.

*Let $X = \{X_t, t \in T\}$ be an arbitrary random function or, equivalently,
an arbitrary class of r.v.’s $X_t$, $t \in T$. Then X induces the pr.
space $(R^T, \mathcal{B}^T, P_X)$, its sample pr. space, where $R^T = \prod_{t \in T} R_t$ is the
range space of X, $\mathcal{B}^T$ is the Borel field in $R^T$, and $P_X$ is the distribution
of X defined by

$$P_X(S) = P[X \in S], \quad S \in \mathcal{B}^T.$$

According to the consistency theorem, $P_X$ determines the consistent
family of the distributions of all finite subfamilies $(X_{t_1}, \ldots,
X_{t_N})$ of the family X and, conversely, a consistent family of distributions
on Borel fields of all finite subspaces of $R^T$ determines a
distribution on $\mathcal{B}^T$. Similarly, the d.f. $F_X$ on $R^T$ is defined by the consistent
family of the d.f.’s $F_{X_{t_1}, \ldots, X_{t_N}}$ of all finite subfamilies of the
family X and, conversely, a consistent family of d.f.’s on all finite subspaces
of $R^T$ defines a d.f. on $R^T$.

Remark. So far, the numerical functions under consideration were
r.v.’s, that is, finite (or a.s. finite) measurable functions. However,
the preceding definitions remain valid for nonfinite measurable functions,
provided the range-spaces are extended, that is, $R, R_k, R_t =
(-\infty, +\infty)$ are replaced by $\bar{R}, \bar{R}_k, \bar{R}_t = [-\infty, +\infty]$. Thus, say, $R^N$
is replaced by $\bar{R}^N = \prod_{k=1}^{N} \bar{R}_k$ and, at the same time, $\mathcal{B}^N$ is replaced by
$\bar{\mathcal{B}}^N$, the Borel field in $\bar{R}^N$, and $P_X$ on $\mathcal{B}^N$ is replaced by $P_X$ on $\bar{\mathcal{B}}^N$.

To fix the ideas, let X be a numerical measurable function, not necessarily
finite. Since $\bar{\mathcal{B}}$ is determined by $\mathcal{B}$ and the sets $\{-\infty\}$ and $\{+\infty\}$,
$P_X$ on $\bar{\mathcal{B}}$ is determined by $P_X$ on $\mathcal{B}$ and the values

$$P_X\{-\infty\} = P[X = -\infty], \quad P_X\{+\infty\} = P[X = +\infty].$$

In fact, $P_X$ on $\bar{\mathcal{B}}$ is determined by the d.f. $F_X$ of X, defined by

$$F_X(x) = P[X < x] = P_X[-\infty, x), \quad x \in R,$$

since

$$F_X(-\infty) = \lim_{x \to -\infty} F_X(x) = P[X = -\infty] \ge 0$$

and

$$F_X(+\infty) = \lim_{x \to +\infty} F_X(x) = P[X < +\infty] = 1 - P[X = +\infty] \le 1.$$

10.2 The essential feature of pr. theory. We are now in a position 
to describe the essential feature of pr. theory as distinct from measure 
theory. 

While pr. concepts are born from experience and, in their rough form, 
are perhaps older than the measure-theoretic ones, yet their rigorous 
formulation was given in this chapter in terms of and by specializing 
the measure-theoretic concepts. Thus, it looks as if, nowadays, pr. 
theory were a part of measure theory or, conversely, as if measure 
theory were a generalized and rigorous pr. theory. Therefore, it is im¬ 
portant to point out the basic distinction between these two interlock¬ 
ing branches of mathematics. The fact is that the distinction does not 
lie in the greater or lesser generality of the concepts, but in the properties
investigated in these branches of mathematics.

Let us start with an analogy. Geometry, say, euclidean plane geometry,
appears to be a part of algebra and analysis, since we can consider
a point in a plane as an ordered pair (x, y) of reals or as a complex
number, a straight line as a linear equation in x and y, etc. Yet, geometry
remains a science per se, not because it has its own terminology or
is older than algebra and analysis, but because geometry studies those 
properties of sets of points that remain invariant under all the trans¬ 
formations which, say, preserve the distances; for example, euclidean 
displacements in the case of the euclidean geometry. And geometric 
terminology developed, frequently unconsciously, for this specific pur¬ 
pose is, on the whole, well adapted to the geometrical intuition, prob¬ 
lems, and methods. 

Now, measure theory investigates families of functions on a measure 
space to other spaces, distinct or not from the first. On the other hand, 
pr. theory has developed and continues to develop the intuition, prob¬ 
lems, and methods of its own in exploring those properties of families 
of functions which remain invariant under all the transformations which 
preserve their joint distributions, the reason being that the primary
datum in random phenomena is not the pr. space but the joint distributions
of the families of r.v.’s which describe the characteristics of
the phenomena. Since the measurable characteristics are finite, pr.
theory limited itself to r.v.’s (which, by definition, are finite). This
explains the historical reason for the restrictions imposed on the
measure-theoretic setup of pr. theory. However, today pr. theory is sufficiently
mature mathematically to show signs of getting rid of those
restrictions, by considering more general families of functions on measure
spaces (normed or not) to more and more abstract spaces. We can
summarize the essential feature of pr. theory as follows:

A PROPERTY IS PR.-THEORETICAL IF, AND ONLY IF, IT IS DESCRIBABLE 
IN TERMS OF A DISTRIBUTION. 

In other words, 

A property of a family of functions on a measure space is pr.-theoretical
if, and only if, the property remains the same when the family is replaced
by any other family with the same distribution.

In particular, since in the numerical case a distribution is represented 
by the corresponding d.f.’s, we can say that 

- the pr.-theoretic properties of a r.v. X are those which can be expressed
in terms of its d.f. $F_X$,

- the pr.-theoretic properties of a finite family $(X_1, X_2, \ldots, X_N)$ of
r.v.’s are those which can be expressed in terms of the joint d.f.
$F_{X_1, X_2, \ldots, X_N}$,

- the pr.-theoretic properties of any family $(X_t, t \in T)$ of r.v.’s are
those which can be expressed in terms of the joint d.f.’s of its finite
subfamilies.

More generally, consider a function X on pr. space $(\Omega, \mathcal{A}, P)$ to
some abstract space $\Omega'$. The class of all sets in $\Omega'$ whose inverse images
under X are events is a σ-field $\mathcal{A}'$ in $\Omega'$; assign to $A' \in \mathcal{A}'$ the number
$P'A' = P(X^{-1}A')$. This defines the induced pr. space $(\Omega', \mathcal{A}', P')$.
The pr.-theoretic properties of X are those which can be expressed in terms
of P' on $\mathcal{A}'$. If we limit ourselves to these properties only, we can speak
of a “stochastic variable” X described by a “pr. law” represented by P'.
Those are the mathematical beings we are concerned with, and the
function X, the measure P' (or the d.f.’s in the preceding cases) are
only various ways of talking about those beings in various languages.
It is important to realize fully that measurements of a stochastic variable
are relative to the induced pr. space; the original pr. space is but a
mathematical fiction. Yet it is basic, for it permits the use of a “common
frame of reference” for the families of stochastic variables we investigate,
namely the families of sub σ-fields of events they induce on the
original pr. space. However, precisely because of the existence of a
common frame of reference in the present setup, modern physics forces
us to introduce a different setup that we shall see in the next volume.


COMPLEMENTS AND DETAILS 


Notation. Unless otherwise stated, the pr. space $(\Omega, \mathcal{A}, P)$ is fixed, the
spaces $L_r$, $L_s$ (r, s > 0) are defined over the pr. space, and, with or without
affixes, A, B, ... denote events, while X, Y, ... denote r.v.’s.

1. Rewrite in pr. terms as many as possible of the complements and details
of Part I.

2. The convex function $\log E|X|^r$ of r is linear if, and only if, X is a degenerate
r.v.

3. Liapounov’s inequality. Let $\mu_r = E|X|^r$. If $r \ge s \ge t \ge 0$, then
$\mu_r^{s-t} \mu_s^{t-r} \mu_t^{r-s} \ge 1$. When does this inequality become an equality? Prove
Hölder’s inequality by means of properties of convex functions. When does
this inequality become an equality?

4. Investigate the possible behaviors of $E^{1/r}|X|^r$ as r varies from $-\infty$ to 0.

5. Apply Markov’s inequality to $X - \frac{a+b}{2}$ to obtain a bound for
$P[a \le X \le b]$. Also use the method of proof of the basic inequalities to obtain
various bounds for this pr.

6. If $g_0$ on $[0, +\infty)$ is a nonnegative Borel function such that $g_0(x) \ge g_0(\epsilon)$
for $x \ge \epsilon$, then $P[|X| \ge \epsilon] \le Eg_0(|X|)/g_0(\epsilon)$. Construct a function g on
$[0, +\infty)$ with g(0) = 0, $g(\epsilon) = g_0(\epsilon)$, which is nondecreasing, continuous where
$g_0$ is continuous, and such that $Eg(|X|) \le Eg_0(|X|)$. Then the above bound
is at least as sharp with g instead of $g_0$.

(Form gi(x) = inf g(x') for x' ^ x and g(x) = min (gi(x), -go(x)).) 

7. Let g with g(0) = 0 be a continuous and nondecreasing function on $[0, +\infty)$.
If there exists an $h = h(Eg(|X|), \epsilon)$ such that $P[|X| \ge \epsilon] \le h \le Eg(|X|)/g(\epsilon)$
for all r.v.’s X, then $h = Eg(|X|)/g(\epsilon)$ for those $\epsilon > 0$ for which the bound is
of interest, that is, for which $Eg(|X|) < g(\epsilon)$. Loosely speaking, the bound
$Eg(|X|)/g(\epsilon)$ is the sharpest of all bounds which depend upon $Eg(|X|)$ and $\epsilon$.

(Take $|X| = \epsilon$ or 0 with pr. p and q = 1 - p ($pq \ne 0$), respectively.)

8. For $\epsilon > 0$ sufficiently small, the bound $E|X|^r/\epsilon^r$ is at least as sharp as
the bound $E|X|^s/\epsilon^s$ with s > r.

9. Let

$d_0(X, Y) = \inf \{P[|X - Y| \ge \epsilon] + \epsilon\}$ for all $\epsilon > 0$;

$d_1(X, Y) = \inf \epsilon$ such that $P[|X - Y| \ge \epsilon] < \epsilon$;

$d_2(X, Y) = Eg(|X - Y|)$, g on $[0, +\infty)$ is bounded continuous and increasing
with g(0) = 0 and $g(x + x') \le g(x) + g(x')$; for instance, take $g(x) = \frac{cx}{1 + cx}$
with c > 0, $g(x) = 1 - e^{-x}$, or $g(x) = \tanh x$.

Each of the three functions $d_0$, $d_1$, $d_2$ is a metric on the space of all r.v.’s,
provided equivalent r.v.’s are identified. Convergence in pr. is equivalent to
convergence in any of the corresponding metric spaces.

10. (a) $\sum |X_n| < \infty$ a.s. if, and only if, the sequence of d.f.’s of consecutive
sums converges to the d.f. of a r.v.

(b) If $\sum E|X_n|^r < \infty$, then $\sum |X_n|^r < \infty$ a.s.

(c) Let s = 1 or $\frac{1}{r}$ according as r < 1 or $r \ge 1$. If $\sum E^s|X_n|^r < \infty$, then
$\sum |X_n| < \infty$ a.s.

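11. $X_n \xrightarrow{P} X$ if, and only if, given $\epsilon > 0$ and $\delta > 0$, there exists $n(\epsilon, \delta)$ such
that $P[|X_n - X| \ge \epsilon] < \delta$ for $n \ge n(\epsilon, \delta)$.

(a) $X_n \xrightarrow{\text{a.s.}} X$ if, and only if, given $\epsilon > 0$ and $\delta > 0$, there exists $n(\epsilon, \delta)$
such that $P[|X_n - X| \ge \epsilon$ for some $n \ge n(\epsilon, \delta)] < \delta$.

(b) $X_n \to X$ uniformly except on a null event if, and only if, given $\epsilon > 0$ there
exists $n(\epsilon)$ such that $P[|X_n - X| \ge \epsilon] = 0$ for $n \ge n(\epsilon)$ or, equivalently,
$P[|X_n - X| \ge \epsilon$ for some $n \ge n(\epsilon)] = 0$.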

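12. $P[X_n \nrightarrow X] = \lim_{\epsilon \to 0} \lim_{n \to \infty} P \bigcup_{k \ge n} [|X_k - X| \ge \epsilon]$.

(a) If $\sum_n P[|X_n - X| \ge \epsilon] < \infty$ for every $\epsilon > 0$, then $X_n \xrightarrow{\text{a.s.}} X$.

(b) If $\sum E|X_n - X|^r < \infty$ for some r > 0, then $X_n \xrightarrow{\text{a.s.}} X$.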

13. $X_n \xrightarrow{\text{a.s.}} X$ if, and only if, there exists a sequence $\epsilon_n \downarrow 0$ such that
$P \bigcup_{k \ge n} [|X_k - X| \ge \epsilon_k] \to 0$. (For the “only if” assertion select $n_m \uparrow \infty$ by
$P \bigcup_{k \ge n_m} [|X_k - X| \ge \frac{1}{m}] < \frac{1}{2^m}$ and take $\epsilon_n = \frac{1}{m}$ for $n_m \le n < n_{m+1}$.) Let
D be the set where the sequence $X_n$ does not converge to a finite function.
PD = lim lim lim P U 1| 石一义 | 2 e] 

• — *07 »— k m 

PD = lim lim , lim P U [| 石 一 Xn | 2 e] 

«—>0 m— k 爾 m 

where lim' denotes lim inf or lim sup indifferently. Can lim' be replaced by
lim?

14. (a) If $\sum P[|X_{n+1} - X_n| \ge \epsilon_n] < \infty$ and $\sum \epsilon_n < \infty$, then the sequence
$X_n$ converges a.s. to a r.v.

(b) If $\sum \sup_\nu P[|X_{n+\nu} - X_n| \ge \epsilon] < \infty$ for every $\epsilon > 0$, or
$\sup_\nu P[|X_{n+\nu} - X_n| \ge \epsilon] \to 0$ and $\sum \liminf_\nu P[|X_{n+\nu} - X_n| \ge \epsilon] < \infty$
for every $\epsilon > 0$, then the sequence $X_n$ converges a.s. to a r.v. (In the last two
cases, $X_n \xrightarrow{P}$ some r.v. X and $P[|X_n - X| \ge 2\epsilon]$ is bounded by the corresponding
term of each of the two series.)

15. Take $X_n = n^c$ or 0 with pr. $\frac{1}{n}$ and $1 - \frac{1}{n}$, respectively, and investigate
convergences of the sequences $X_n$ and $E|X_n|^r$ according to the choice of c and
of r.

16. If $F_{X_n} \to F_X$ on $C(F_X)$ and $Y_n \xrightarrow{P} c$, then $F_{X_n + Y_n} \to F_{X+c}$ on
$C(F_{X+c})$ (Slutsky).

What about $X_n Y_n$, $X_n / Y_n$ and in general $g(X_n, Y_n)$ where g is continuous?
(Use 10.1d.)

17. Take $X_{2n-1} = \frac{1}{n}$, $X_{2n} = -\frac{1}{n}$ and investigate the sequences $X_n$ and $F_{X_n}$.

Take $X_n = 0$ or 1, each with pr. $\frac{1}{2}$, and X = 1 or 0, each with pr. $\frac{1}{2}$, respectively. Then
$|X_n - X| = 1$ but $F_{X_n} = F_X$. To what converse is it a counterexample?

18. If the sequence $X_n$ converges a.s. to a nonfinite function, what can be
said about the sequence $F_{X_n}$?

19. Let $\{F_n\}$ be a denumerable family of d.f.’s with $F_n(-\infty) = 0$ and
$F_n(+\infty) = 1$. The family of all functions $F_{n_1 \ldots n_m} = F_{n_1} \times \cdots \times F_{n_m}$ is a consistent
family of d.f.’s. Construct as many pr. spaces as you can, on which are
defined r.v.’s $X_n$ such that $F_{X_{n_1}, \ldots, X_{n_m}} = F_{n_1 \ldots n_m}$ for all finite index sets.

Extend what precedes to a family $\{F_t\}$ where t ranges over an arbitrarily
given set T.

20. There is no universal pr. space for all possible r.v.’s on all possible pr.
spaces.

21. Extend as much as possible of this chapter and of the foregoing complements
and details to complex-valued r.v.’s and to complex vectors, by suitably
interpreting the symbols used.







Chapter IV 


DISTRIBUTION FUNCTIONS AND 
CHARACTERISTIC FUNCTIONS 


§ 11. DISTRIBUTION FUNCTIONS 

11.1 Decomposition. In pr. theory, a distribution function (d.f.), to
be denoted by F, with or without affixes, is a nondecreasing function,
continuous from the left and bounded by 0 and 1 on R. This definition
entails at once that the quantities,

$$F(-\infty) = \lim_{x \to -\infty} F(x) = \inf F, \quad F(+\infty) = \lim_{x \to +\infty} F(x) = \sup F,$$

$$F(x) = F(x - 0) = \lim_{x_n \uparrow x} F(x_n) = \sup_{x' < x} F(x'),$$

$$F(x + 0) = \lim_{x_n \downarrow x} F(x_n) = \inf_{x' > x} F(x'),$$

exist and are bounded by 0 and 1, and x is a continuity or a discontinuity
point of F according as $F(x + 0) - F(x - 0) = 0$ or > 0. As we
have seen, a d.f. is always the d.f. of a measurable function on a pr.
space, and if $F(-\infty) = 0$, $F(+\infty) = 1$, then it is the d.f. of a r.v.

The requirement of continuity from the left is of no importance,
since every nondecreasing function $F_1$ on R bounded by 0 and 1 determines
a d.f. F by setting $F(x) = F_1(x)$ or $F(x) = F_1(x - 0)$ according
as x is a continuity or a discontinuity point of $F_1$. In fact, even less
is necessary to determine a d.f.

Let D denote a set dense in R (for example, the set of all rationals)
and let $F_D$ denote a nondecreasing function on D bounded by 0 and 1.
We can assume, without loss of generality, that it is continuous from
the left on D. Since, for every $x \in R$, there exists a sequence $\{x_n\} \subset D$
such that $x_n \uparrow x$, it follows easily that, according to the definition
of d.f.’s,

a. The function F defined on R by

$$F(x) = \lim_{x_n \uparrow x} F_D(x_n), \quad x_n \in D, \quad x_n < x,$$

is a d.f.

It follows that, if two d.f.’s coincide on a set dense in R, they coincide
everywhere. Furthermore, monotoneity of d.f.’s leads to the

A. Decomposition theorem. Every d.f. F has a countable set of discontinuity
points and determines two d.f.’s $F_c$ and $F_d$ such that $F_c$ is continuous,
$F_d$ is a step-function, and $F = F_c + F_d$.

Proof. If F has at least n discontinuity points $x_k$

$$a \le x_1 < x_2 < \cdots < x_n < b$$

in a finite interval [a, b), then, from

$$F(a) \le F(x_1) < F(x_1 + 0) \le \cdots \le F(x_n) < F(x_n + 0) \le F(b),$$

it follows, setting $p(x_k) = F(x_k + 0) - F(x_k)$, that

$$\sum_{k=1}^{n} p(x_k) = \sum_{k=1}^{n} \{F(x_k + 0) - F(x_k)\} \le F(b) - F(a).$$

Therefore, the number of discontinuity points x in [a, b) with jumps
$p(x) > \epsilon > 0$ is bounded by $\frac{1}{\epsilon}\{F(b) - F(a)\}$. Thus, for every integer
m, the number of discontinuity points with jumps greater than $\frac{1}{m}$ is
finite and, hence, there is no more than a countable set of discontinuity
points in every finite interval [a, b). Since R is a denumerable sum of
such intervals, the same is true of the set of all discontinuity points,
and the first assertion is proved. Furthermore, denoting the discontinuity
set by $\{x_n\}$, we have, for every interval [a, b), finite or not,

$$\sum_{a \le x_n < b} p(x_n) \le F(b) - F(a).$$

Upon defining $F_d$ by

$$F_d(x) = \sum_{x_n < x} p(x_n), \quad x \in R,$$

and setting $F_c = F - F_d$, it follows at once that $F_d$ and $F_c$ are d.f.’s.
But, for x < x',

$$F_c(x') - F_c(x) = F(x') - F(x) - \sum_{x \le x_n < x'} p(x_n) = F(x') - F(x + 0) - \sum_{x < x_n < x'} p(x_n),$$

so that, letting $x' \downarrow x$, we obtain

$$F_c(x + 0) - F_c(x) = 0;$$

thus $F_c$ is also continuous from the right and hence continuous.

Finally, if there are two such decompositions of F,

$$F = F_c + F_d = F'_c + F'_d,$$

then $F_c - F'_c = F'_d - F_d$, and both sides must vanish since the left-hand
side is continuous while the right-hand side is discontinuous, except
when it vanishes identically. This completes the proof.
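The decomposition $F = F_c + F_d$ can be traced on a d.f. with finitely many jumps. In the sketch below the d.f. is a hypothetical mixture (half of the uniform d.f. on [0, 1] plus two jumps of size 1/4), and the right limit F(x + 0) is supplied as a separate function since it cannot be read off from point evaluations; all names are illustrative.

```python
from fractions import Fraction

# Hypothetical jumps p(x_k) of F at the atoms x_k.
atoms = {Fraction(0): Fraction(1, 4), Fraction(1, 2): Fraction(1, 4)}

def continuous_part(x):
    """Half of the uniform d.f. on [0, 1]; plays the role of F_c in this example."""
    return Fraction(1, 2) * min(max(Fraction(x), Fraction(0)), Fraction(1))

def F(x):
    """Left-continuous d.f.: only jumps at atoms strictly below x are counted."""
    return continuous_part(x) + sum((p for a, p in atoms.items() if a < x), Fraction(0))

def F_plus(x):
    """Right limit F(x + 0): jumps at atoms up to and including x are counted."""
    return continuous_part(x) + sum((p for a, p in atoms.items() if a <= x), Fraction(0))

def jump(x):
    """p(x) = F(x + 0) - F(x)."""
    return F_plus(x) - F(x)

def F_d(x, candidates):
    """Step part: sum of the jumps at candidate points strictly below x."""
    return sum((jump(c) for c in candidates if c < x), Fraction(0))

def F_c(x, candidates):
    """Continuous part obtained as F - F_d."""
    return F(x) - F_d(x, candidates)

if __name__ == "__main__":
    candidates = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]
    for x in (Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(2)):
        print(x, F(x), F_d(x, candidates), F_c(x, candidates))
```

Candidate points that are not discontinuity points contribute a zero jump, so listing extra candidates does not change $F_d$.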

Remark. Since the discontinuity set of a d.f. is countable, its continuity
set is always dense in R. However, the discontinuity set can
also be dense in R. For example, let $\{r_n\}$ be the set of all rationals in
R (it is dense in R); if $p(r_n) = \frac{6}{\pi^2 n^2}$, then the function F defined by

$$F(x) = \sum_{r_n < x} p(r_n), \quad x \in R,$$

is a d.f. and, in fact, is the d.f. of a r.v., since $F(-\infty) = 0$ and
$F(+\infty) = \sum_n p(r_n) = 1$.

Further decomposition. $F_c$ determines, by $\mu_c(-\infty, x) = F_c(x) -
F_c(-\infty)$, a finite measure $\mu_c$ on the Borel field $\mathcal{B}$ in R. Upon applying
to $\mu_c$ the Lebesgue decomposition theorem with respect to the Lebesgue
measure on $\mathcal{B}$ we obtain

$$\mu_c = \mu_{ac} + \mu_s, \quad \mu_{ac}(S) = \int_S g(x)\, dx, \quad S \in \mathcal{B},$$

where $g \ge 0$ is a Borel function and $\mu_s = 0$ on the complement of some
Lebesgue-null set $N_s$. It follows that there are d.f.’s $F_{ac}$ and $F_s$ which
correspond to the measures $\mu_{ac}$ and $\mu_s$, respectively, such that

$$F_c = F_{ac} + F_s, \quad F_{ac}(x) = \int_{-\infty}^{x} g(t)\, dt, \quad g \ge 0,$$

and $F_s$ is a continuous d.f. whose points of increase all lie in $N_s$. Thus

A'. Every d.f. F determines three d.f.’s of which F is the sum:

- the step part $F_d$ which is a step function,
- the absolutely continuous part $F_{ac}$ such that

$$F_{ac}(x) = \int_{-\infty}^{x} g(t)\, dt, \quad g \ge 0, \quad x \in R,$$

- the singular part $F_s$ which is a continuous function with points of
increase all belonging to a Lebesgue-null set.

11.2 Convergence of d.f.’s. As 10.1c and 11.1a suggest, convergence
of d.f.’s to a d.f. F ought to be defined without taking into account
what happens on the discontinuity set of F.

We say that a sequence $F_n$ of d.f.’s converges weakly to a d.f. F and
write $F_n \xrightarrow{w} F$, if $F_n \to F$ on the continuity set C(F) of F. This definition
is justified, that is, the weak limit, if it exists, is unique, since
$F_n \xrightarrow{w} F$ and $F_n \xrightarrow{w} F'$ imply F = F' on the set $C(F) \cap C(F')$ and, on
the remaining set, which, by 11.1A, is countable, F = F' by continuity
from the left.

We say that a sequence $F_n$ of d.f.’s converges completely and write
$F_n \xrightarrow{c} F$, if $F_n \xrightarrow{w} F$ and $F_n(\mp\infty) \to F(\mp\infty)$. Weak convergence does
not imply complete convergence. For example, given a d.f. $F_0$ with at
least one point of increase so that $F_0(-\infty) \ne F_0(+\infty)$, let $F_n(x) =
F_0(x + n)$. Then $F_n \to F_0(+\infty)$ and the weak convergence holds but
not the complete convergence. However, in the case of weak convergence
we have

a. Let $F_n \xrightarrow{w} F$. Then

$$\limsup F_n(-\infty) \le F(-\infty) \le F(+\infty) \le \liminf F_n(+\infty),$$

$$\operatorname{Var} F \le \liminf \operatorname{Var} F_n$$

and $F_n \xrightarrow{c} F$ if, and only if, $\operatorname{Var} F_n \to \operatorname{Var} F$ or $\operatorname{Var} F_n - F_n[-a, +a)
\to 0$ uniformly in n as $a \to \infty$.

For, from

$$F_n(-\infty) \le F_n(x) \le F_n(+\infty),$$

it follows that, for $x \in C(F)$,

$$\limsup F_n(-\infty) \le F(x) \le \liminf F_n(+\infty)$$

and, letting $x \to \mp\infty$ along C(F), the first inequalities are proved.








Thus

$$\operatorname{Var} F = F(+\infty) - F(-\infty) \le \liminf \bigl(F_n(+\infty) - F_n(-\infty)\bigr) = \liminf \operatorname{Var} F_n,$$

and the second assertion follows from the same inequalities.
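
The translation example $F_n(x) = F_0(x + n)$ given before a can be checked numerically. The following is only an illustrative sketch: the logistic function is an assumed choice of $F_0$ (any d.f. with $F_0(-\infty) \ne F_0(+\infty)$ would serve), and the names `F0`, `Fn`, `mass_inside` are hypothetical helpers, not notation of the text.

```python
import math

def F0(x: float) -> float:
    """Logistic d.f. (an assumed choice): F0(-inf) = 0, F0(+inf) = 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)          # avoids overflow for very negative x
    return e / (1.0 + e)

def Fn(n: int, x: float) -> float:
    """Translated d.f. F_n(x) = F_0(x + n)."""
    return F0(x + n)

def mass_inside(n: int, a: float) -> float:
    """F_n[-a, +a) = F_n(a) - F_n(-a)."""
    return Fn(n, a) - Fn(n, -a)

# At every fixed x, F_n(x) approaches F_0(+inf) = 1: weak convergence to the
# constant d.f. F = 1, whose variation is 0.
pointwise = [Fn(100, x) for x in (-20.0, -5.0, 0.0, 5.0)]

# Var F_n = 1 for every n, yet the mass left outside [-10, 10) is not small
# uniformly in n, so the convergence is not complete.
escaped = [1.0 - mass_inside(n, 10.0) for n in (0, 10, 100)]
```

Here $\operatorname{Var} F = 0 < 1 = \liminf \operatorname{Var} F_n$, so the inequality of a is strict, and the quantities in `escaped` show that $\operatorname{Var} F_n - F_n[-a, +a)$ does not tend to 0 uniformly in $n$.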

We still have to find a way to recognize whether a given sequence 
F n of d.f.’s converges, weakly or completely. 

b. A sequence $F_n$ of d.f.’s converges weakly if, and only if, it converges
on a set $D$ dense in $R$.

Proof. The “only if” assertion follows from the fact that the continuity
set of a d.f. is dense in $R$. As for the “if” assertion, let $F_D =
\lim F_n$ on $D$. The relation of 11.1a determines a d.f. $F$ on $R$. Since,
for $x' < x < x''$,

$$F_n(x') \le F_n(x) \le F_n(x''),$$

it follows that, for $x', x'' \in D$,

$$F_D(x') \le \liminf F_n(x) \le \limsup F_n(x) \le F_D(x'').$$

Taking $x \in C(F)$ and letting $x' \uparrow x$ and $x'' \downarrow x$ along $D$, we obtain

$$F(x) = \lim F_n(x), \quad x \in C(F),$$

and the “if” assertion is proved.

We are now in a position to prove the basic Helly 

A. Weak compactness theorem. Every sequence of d.f.’s is weakly
compact.

We recall that (at least here) a set is compact in the sense of a type of
convergence if every infinite sequence in the set contains a subsequence
which converges in the same sense.

Proof. It suffices to show that, if $F_n$ is a sequence of d.f.’s, then there
is a subsequence which converges weakly. According to b, it suffices
to prove that there is a subsequence which converges on a set $D$ dense
in $R$.

Let $D = \{x_n\}$ be an arbitrary countable set dense in $R$, say, the set
of all rationals. All terms of the numerical sequence $F_n(x_1)$ lie between
0 and 1 and, therefore, by the Bolzano-Weierstrass compactness lemma,
this sequence contains a convergent subsequence $F_{n1}(x_1)$. Similarly,
the numerical sequence $F_{n1}(x_2)$ contains a convergent subsequence
$F_{n2}(x_2)$ and the sequence $F_{n2}(x_1)$ converges, and so on. It follows
that the “diagonal” sequence $F_{nn}$ of d.f.’s, contained in all the subsequences
$\{F_{n1}\}, \{F_{n2}\}, \ldots$, converges on $D$, and the proof is complete.

B. Complete compactness criterion. A sequence $F_n$ of d.f.’s is
completely compact if, and only if, it is equicontinuous at infinity:
$\operatorname{Var} F_n - F_n[-a, +a) \to 0$ uniformly in $n$ as $a \to +\infty$.

Proof. The “if” assertion is immediate. As for the “only if” assertion,
if the $F_n$ are not equicontinuous at infinity, then, by a and A, there
exists a subsequence $F_{n'}$ which converges weakly but not completely.
Note that our “complete” convergence is frequently called “weak” and
our “weak” is sometimes replaced by “vague.”


11.3 Convergence of sequences of integrals. Let $g$ denote a function
continuous on $R$ and let $F$, with or without affixes, denote a d.f.
We intend to investigate conditions under which weak or complete convergence
of a sequence $F_n$ implies convergence of the corresponding
sequence of integrals $\int g\,dF_n$, when these integrals exist. Let us observe
that these integrals do not change if arbitrary constants are added
to the d.f.’s. The investigation is centered upon the basic

a. Helly-Bray lemma. If $F_n \xrightarrow{w} F$ up to additive constants, then,
for every pair $a < b$ such that $F_n(a) \to F(a)$ and $F_n(b) \to F(b)$,

$$\int_a^b g\,dF_n \to \int_a^b g\,dF.$$

Proof. Setting $g_m = \sum_{k=1}^{k_m} g(x_{mk})\, I_{[x_{mk},\, x_{m,k+1})}$, where

$$a = x_{m1} < x_{m2} < \cdots < x_{m,k_m+1} = b$$

and $\Delta_m = \sup_k\, (x_{m,k+1} - x_{mk}) \to 0$ as $m \to \infty$, we have, according to
the definition of R.-S. integrals,

$$\int_a^b g_m\,dF = \sum_{k=1}^{k_m} g(x_{mk})\, F[x_{mk}, x_{m,k+1}) \to \int_a^b g\,dF.$$

Upon selecting all subdivision points $x_{mk}$ to be continuity points of $F$,
it follows from $F_n \xrightarrow{w} F$ that, for every $m$ and every $k$, as $n \to \infty$,

$$F_n[x_{mk}, x_{m,k+1}) \to F[x_{mk}, x_{m,k+1})$$

and, hence,

$$\int_a^b g_m\,dF_n = \sum_{k=1}^{k_m} g(x_{mk})\, F_n[x_{mk}, x_{m,k+1}) \to \sum_{k=1}^{k_m} g(x_{mk})\, F[x_{mk}, x_{m,k+1}) = \int_a^b g_m\,dF.$$

Since

$$\int_a^b g\,dF_n - \int_a^b g\,dF = \int_a^b (g - g_m)\,dF_n + \int_a^b g_m\,dF_n - \int_a^b g_m\,dF + \int_a^b (g_m - g)\,dF$$

and the first and last integrals on the right-hand side are bounded by
$\sup_{a \le x < b} |g(x) - g_m(x)| \to 0$ as $m \to \infty$, the assertion follows by letting
$n \to \infty$ and then $m \to \infty$.

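The extensions of this lemma will be based upon the obvious inequality

$$(1)\qquad \Bigl|\int g\,dF_n - \int g\,dF\Bigr| \le \Bigl|\int g\,dF_n - \int_a^b g\,dF_n\Bigr| + \Bigl|\int_a^b g\,dF_n - \int_a^b g\,dF\Bigr| + \Bigl|\int_a^b g\,dF - \int g\,dF\Bigr|$$

with $a$ and $b$ continuity points of $F$, provided the integrals exist and
are finite.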

A. Extended Helly-Bray lemma. If $g(\mp\infty) = 0$, then $F_n \xrightarrow{w} F$ up
to additive constants implies $\int g\,dF_n \to \int g\,dF$.

Proof. Since $g$ is continuous and its limits as $x \to \mp\infty$ exist and
are finite, $g$ is bounded on $R$ and the integrals $\int g\,dF_n$ and $\int g\,dF$ exist
and are finite. Letting $n \to \infty$ and then $a \to -\infty$, $b \to +\infty$, it
follows that, out of the three right-hand side terms in (1), the second
converges to 0 by the Helly-Bray lemma, whereas the first and the
third ones are bounded by $\sup_{x \notin [a, b)} |g(x)| \to 0$. The assertion is proved.

B. Helly-Bray theorem. If $g$ is bounded on $R$, then $F_n \xrightarrow{c} F$ up
to additive constants implies $\int g\,dF_n \to \int g\,dF$.

Proof. Since $|g| \le c < \infty$, the integrals exist and are finite. Letting
$n \to \infty$ and then $a \to -\infty$, $b \to +\infty$, it follows that, out of the three
terms on the right-hand side of (1), the second converges to 0 by the
Helly-Bray lemma, whereas the first and the third ones are bounded,
respectively, by

$$c\,\{\operatorname{Var} F_n - F_n[a, b)\} \to 0 \quad \text{and} \quad c\,\{\operatorname{Var} F - F[a, b)\} \to 0;$$

and the assertion follows.

Remark. All the results of these subsections extend, without further
ado, to d.f.’s $F$ on $R^N$ and continuous functions $g$ on $R^N$, with the usual
conventions for the symbols used above.
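
The Helly-Bray theorem can be illustrated by a small numerical check. The sketch below rests on assumptions that are not in the text: $F_n$ is taken to place mass $1/n$ at each point $k/n$, $k = 1, \ldots, n$, so that $F_n \xrightarrow{c} F$ with $F$ the uniform d.f. on $[0, 1]$, and $g = \cos$ serves as the bounded continuous function. The name `integral_dFn` is a hypothetical helper.

```python
import math

def integral_dFn(g, n: int) -> float:
    """Integral of g with respect to F_n, which puts mass 1/n at k/n, k = 1..n."""
    return sum(g(k / n) for k in range(1, n + 1)) / n

g = math.cos               # bounded and continuous on R
limit = math.sin(1.0)      # integral of cos with respect to the uniform d.f. on [0, 1]

# |integral of g dF_n - integral of g dF| for increasing n
errors = [abs(integral_dFn(g, n) - limit) for n in (10, 100, 1000)]
```

The errors shrink roughly like $1/n$ here, which is specific to this choice of $F_n$ and $g$; the theorem itself asserts only convergence.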


*11.4 Further extension and convergence of moments. Let $g$ on $R$ be
continuous and $F$ on $R$, with or without affixes, be a d.f. The integrals
we are interested in are finite Lebesgue-Stieltjes integrals of the form
$\int g\,dF$, that is, such that $\int |g|\,dF < \infty$; they are, therefore, absolutely
convergent improper Riemann-Stieltjes integrals.

We say that $|g|$ is uniformly integrable in $F_n$ if, as $a \to -\infty$, $b \to +\infty$,
$\int_a^b |g|\,dF_n \to \int |g|\,dF_n < \infty$ uniformly in $n$; in other words,
given $\varepsilon > 0$,

$$\int |g|\,dF_n - \int_a^b |g|\,dF_n < \varepsilon$$

for $a \le a_\varepsilon$ and $b \ge b_\varepsilon$ independent of $n$. Since $\int_a^b |g|\,dF_n$ does not decrease
as $a \downarrow -\infty$ and/or $b \uparrow +\infty$, it suffices to require the foregoing
conditions for some set of values of $|a|$ and $b$ going to infinity; for example,
that $\int_{|x| \ge c_m} |g|\,dF_n \to 0$ uniformly in $n$ as $m \to \infty$ with
$c_m \to \infty$.

We consider now properties of the foregoing integrals which follow
from the weak convergence of d.f.’s $F_n$; they contain the extensions of
the Helly-Bray lemma of the preceding subsection (we leave the verification
to the reader).
A. Convergence theorem. If $F_n \xrightarrow{w} F$ up to additive constants,
then

(i) $\liminf \int |g|\,dF_n \ge \int |g|\,dF$;

(ii) $|g|$ is uniformly integrable in $F_n$ $\Rightarrow$ $\int g\,dF_n \to \int g\,dF$;

(iii) $\int |g|\,dF_n \to \int |g|\,dF < \infty$ $\Leftrightarrow$ $|g|$ is uniformly integrable in $F_n$.

Proof. Let $\pm c$ be continuity points of $F$, and use repeatedly the
Helly-Bray lemma.

(i) follows, by letting $n \to \infty$ and then $c \to +\infty$, from

$$\int |g|\,dF_n \ge \int_{|x| < c} |g|\,dF_n \to \int_{|x| < c} |g|\,dF \to \int |g|\,dF.$$

(ii) is proved as follows:

Given $\varepsilon > 0$, let $\int_{|x| \ge c} |g|\,dF_n < \varepsilon$ for $c \ge c_\varepsilon$ whatever be $n$. By
the Helly-Bray lemma, if $c' > c$ and $\pm c'$ (like $\pm c$) are continuity points of
$F$, then $\int_{c \le |x| < c'} |g|\,dF \le \varepsilon$ and, letting $c' \to \infty$, we have $\int_{|x| \ge c} |g|\,dF
\le \varepsilon$ and hence $\int |g|\,dF < \infty$. Furthermore, by taking $c \ge c_\varepsilon$ and
letting $n \to \infty$ and then $\varepsilon \to 0$,

$$\Bigl|\int g\,dF_n - \int g\,dF\Bigr| \le \int_{|x| \ge c} |g|\,dF_n + \Bigl|\int_{|x| < c} g\,dF_n - \int_{|x| < c} g\,dF\Bigr| + \int_{|x| \ge c} |g|\,dF \to 0.$$

(iii) $\Rightarrow$ follows from

$$\int_{|x| \ge c} |g|\,dF_n \le \Bigl|\int |g|\,dF_n - \int |g|\,dF\Bigr| + \int_{|x| \ge c} |g|\,dF + \Bigl|\int_{|x| < c} |g|\,dF - \int_{|x| < c} |g|\,dF_n\Bigr|$$

by taking $c = c_0$ such that the second right-hand side term is less than
$\varepsilon/3$, then $n \ge n_0$ such that the first and the third right-hand side terms
are less than $\varepsilon/3$, and finally $c_\varepsilon = \max(c_0, c_1, \ldots, c_{n_0 - 1})$ where $c_k$
($k = 1, \ldots, n_0 - 1$) are such that $\int_{|x| \ge c_k} |g|\,dF_k < \varepsilon$; thus,

$$\int_{|x| \ge c} |g|\,dF_n < \varepsilon$$

for $c \ge c_\varepsilon$ whatever be $n$.

(iii) $\Leftarrow$ follows by (ii) where $g$ is replaced by $|g|$.

This proves the last assertion and terminates the proof.
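
The role of uniform integrability in A can be seen on a simple family. The sketch below assumes, for illustration only, that $F_n$ places mass $1 - 1/n$ at 0 and mass $1/n$ at $n$, so that $F_n \xrightarrow{c} F$ with $F$ the d.f. of the unit mass at 0. The names `moment` and `sup_tail` are hypothetical helpers, and the supremum over $n$ is truncated at `n_max`.

```python
def moment(n: int, r: float) -> float:
    """Integral of |x|^r dF_n for r > 0: only the atom at n contributes, (1/n) * n**r."""
    return float(n) ** (r - 1.0)

def sup_tail(c: float, r: float, n_max: int = 10_000) -> float:
    """sup over n <= n_max of the integral of |x|^r over |x| >= c with respect to F_n."""
    # The atom at n lies in {|x| >= c} only when n >= c; the atom at 0 never does (c > 0).
    return max((moment(n, r) for n in range(1, n_max + 1) if n >= c), default=0.0)

# r = 1: the tails stay equal to 1 however large c is, so |x| is not uniformly
# integrable in F_n; indeed the first moments equal 1 while the limit moment is 0.
tail_r1 = sup_tail(100.0, 1.0)

# r = 1/2: the tails are at most c**(-1/2), so |x|^(1/2) is uniformly integrable
# and the moments n**(-1/2) converge to the limit moment 0.
tail_r_half = sup_tail(100.0, 0.5)
```

For $r = 1$ the inequality (i) is strict ($1 > 0$) and the conclusion of (ii) fails, in agreement with (iii).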

Application. Let

$$m^{(k)} = \int x^k\,dF(x), \quad k = 0, 1, 2, \ldots, \qquad \mu^{(r)} = \int |x|^r\,dF(x), \quad r \ge 0,$$

define, respectively, the $k$th moment (if it exists) and the $r$th absolute
moment of the d.f. $F$ or, equivalently, of the finite part of a measurable
function $X$ with d.f. $F$; if $X$ is a r.v., then this definition coincides with
that given in 9.3. If $F$ possesses subscripts, we affix the same subscripts
to its moments.

B. Moment convergence theorem. If, for a given $r_0 > 0$, $|x|^{r_0}$
is uniformly integrable in $F_n$, then the sequence $F_n$ is completely compact
and, for every subsequence $F_{n'} \xrightarrow{c} F$ and all $k, r \le r_0$,

$$m_{n'}^{(k)} \to m^{(k)} \text{ finite}, \qquad \mu_{n'}^{(r)} \to \mu^{(r)} \text{ finite}.$$

Proof. According to the weak compactness theorem, there is a subsequence
$F_{n'}$ and a d.f. $F$ such that $F_{n'} \xrightarrow{w} F$. On the other hand, the
uniformity condition for $|x|^{r_0}$ implies that, for every $r \le r_0$, as $c \to \infty$,

$$\int_{|x| \ge c} |x|^r\,dF_{n'} \le c^{\,r - r_0} \int_{|x| \ge c} |x|^{r_0}\,dF_{n'} \to 0$$

uniformly in $n'$, so that the uniformity condition holds for $|x|^r$. Therefore,
the preceding convergence theorem applies to every sequence
$m_{n'}^{(k)}$ and $\mu_{n'}^{(r)}$ with $k, r \le r_0$. In particular, taking $r = 0$, we obtain
$\operatorname{Var} F_{n'} \to \operatorname{Var} F$, so that $F_{n'} \xrightarrow{c} F$. The theorem is proved.


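Corollary. If the sequence $\mu_n^{(r_0 + \delta)}$ is bounded for some $\delta > 0$, then
the conclusion of the foregoing theorem holds.


For $\mu_n^{(r_0 + \delta)} \le a < \infty$ implies that, as $c \to +\infty$,

$$\int_{|x| \ge c} |x|^{r_0}\,dF_n \le \frac{1}{c^{\delta}} \int |x|^{r_0 + \delta}\,dF_n \le \frac{a}{c^{\delta}} \to 0 \quad \text{uniformly in } n.$$
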
This corollary yields at once the following solution of the celebrated
“moment convergence problem” (Fréchet and Shohat).

C. If, for $k \ge k_0$ arbitrary but fixed, the sequences $m_n^{(k)} \to m^{(k)}$ finite,
then these sequences converge for every value of $k$, and their limits $m^{(k)}$ are
finite and are the moments of a d.f. $F$ such that there exists a subsequence
$F_{n'} \xrightarrow{c} F$.

If, moreover, these limits determine $F$ up to an additive constant, then
$F_n \xrightarrow{c} F$ up to an additive constant.


It suffices to apply the foregoing corollary and to observe that, if the
$m^{(k)}$ determine $F$ up to an additive constant, then all completely convergent
subsequences $F_{n'}$ have the same limit d.f. $F$ up to additive
constants.

*11.5. Discussion. A d.f. $F$ determined up to additive constants
corresponds biunivoquely to an interval function $F$ determined by
$F[a, b) = F(b) - F(a)$ which in turn corresponds biunivoquely to a
measure $F$ on the Borel field in $R$ (4.4a), a subprobability (subpr.)
since $F(R) \le 1$.


Weak convergence of d.f.’s $F_n$ to $F$, all determined up to additive constants,
is equivalent to convergence of interval functions defined by
$F_n[a, b) \to F[a, b)$ for every $F$-continuity interval $[a, b)$, that is with $F\{a\} =
F\{b\} = 0$, and we can still write $F_n \xrightarrow{w} F$. The above appearance of
subpr.’s permits us to extend propositions in 11.3 and 11.4 to noncontinuous
functions $g$. Since these propositions derive from Helly-Bray lemma
11.3a, it will suffice to generalize it and the others will follow as before.
Denote by $D_g$ the set of discontinuities of a function $g$ on $R$ to $R$; it is a
Borel set (see §12). If $F(D_g) = 0$ we say that $g$ is $F$-a.e. continuous.


a. Generalized Helly-Bray lemma. If $F_n \xrightarrow{w} F$ then $\int_a^b g\,dF_n \to
\int_a^b g\,dF$ for every $F$-continuity interval $[a, b)$ and every $F$-a.e. continuous
function $g$ bounded on every bounded interval.


Proof. The method of proof of the Helly-Bray lemma in 11.3 applies
but for one necessary change due to the fact that our integrals are now
Lebesgue-Stieltjes ones so that instead of Riemann sums we use Darboux
sums: Instead of $g_m$ we need $\underline{g}_m$ and $\overline{g}_m$ defined by

$$\underline{g}_m = \sum_{k=1}^{k_m} \underline{g}_{mk}\, I_{mk}, \qquad \overline{g}_m = \sum_{k=1}^{k_m} \overline{g}_{mk}\, I_{mk},$$

where $I_{mk}$ are indicators of $F$-continuity intervals $J_{mk} = [x_{mk}, x_{m,k+1})$
of length $|J_{mk}|$ with $\sum_{k=1}^{k_m} J_{mk} = [a, b)$, $\sup_k |J_{mk}| \to 0$ as $m \to \infty$,
where

$$\underline{g}_{mk} = \inf\{g(x) : x \in J_{mk}\}, \qquad \overline{g}_{mk} = \sup\{g(x) : x \in J_{mk}\}.$$

As $n \to \infty$, by hypothesis, $F_n(J_{mk}) \to F(J_{mk})$, so that

$$\int \underline{g}_m\,dF_n \to \int \underline{g}_m\,dF \quad \text{and} \quad \int \overline{g}_m\,dF_n \to \int \overline{g}_m\,dF,$$

while $F(D_g) = 0$ implies that $F$-a.e., as $m \to \infty$,

$$\underline{g}_m \uparrow g \downarrow \overline{g}_m.$$

Therefore, letting $n \to \infty$ then $m \to \infty$ in

$$\int \underline{g}_m\,dF_n \le \int_a^b g\,dF_n \le \int \overline{g}_m\,dF_n,$$

it follows that

$$\int_a^b g\,dF_n \to \int_a^b g\,dF.$$

The lemma is proved.

So far we considered only numerical functions $g$. But all propositions
in 11.3 and 11.4 as well as the one above remain valid for complex-valued
$g = \Re g + i\,\Im g$ by setting, say, $\int g\,dF = \int (\Re g)\,dF + i \int (\Im g)\,dF$. In
fact, then, the converses of the Helly-Bray lemma and of the Helly-Bray
theorem are valid because of the weak and complete convergence criteria
in 13.2. We shall leave these immediate extensions to the reader.


Several questions arise at once: Since Borel fields are generated by the 
class of open (of closed) sets, are subpr.’s determined by their values on 
such a class ? Is weak convergence determined by the behaviour of 
subpr.’s on open (on closed) sets? Since weak and complete convergence 
are determined by convergence of integrals of some families of functions 
are there other such families ? 


It will be convenient to discuss these questions for subpr.’s on Borel 
fields of metric spaces. First, because this generality is needed for 
“functional limit theorems” (see Chapter XII) and second, because the 
proofs are not more involved than for the real line. However, this 
generality creates two difficulties: First, we do not have intervals in
general metric spaces, hence no interval functions, and are reduced to
working directly with subpr.’s. Second, nontrivial continuous functions $g$
vanishing at infinity (that is, such that given $\varepsilon > 0$ there is a compact $K$
with $|g| < \varepsilon$ on $K^c$) may not exist. In fact, on the separable Banach
space $C[a, b]$ of continuous functions on $[a, b]$ to $R$ with the supremum
norm, the only continuous function vanishing at infinity is the zero function.
Now, this space is central to Ch. XII. Thus the extended Helly-Bray
lemma is useless. However, the Helly-Bray theorem, with integrals of
bounded continuous functions, with respect to subpr.’s $\mu_n$, $\mu$, remains
meaningful. But, in the case of the real line, it corresponds to complete
convergence $\mu_n \xrightarrow{c} \mu$ or, equivalently, weak convergence of pr.’s $\mu_n/\mu_n(R)$
to a pr. $\mu/\mu(R)$ (excluding the trivial case of $\mu(R) = 0$). Thus, in the
general case we are led to consider only weak convergence of pr.’s to a pr.
and the corresponding “relative compactness”: As is easily seen, 11.2b
implies that a sequence of pr.’s $F_n$ on $R$ contains a subsequence which
converges weakly to a pr. if and only if for every $\varepsilon > 0$ there is a compact
$K_\varepsilon$ in $R$ with $F_n(K_\varepsilon^c) < \varepsilon$ for all $n$. Is there a similar criterion for metric
spaces? Answers to the foregoing questions are to be found in the next
section.

*§12. CONVERGENCE OF PROBABILITIES ON METRIC SPACES 

Throughout this section and unless otherwise stated, with or without 
affixes 

1. $\mathcal{X}$ is a space with metric $d$ and Borel field $\mathcal{S}$ generated by the class
of its open (of its closed) sets, $U$, $C$, $K$ are its open, closed, compact sets,
respectively, and $\partial A = \bar{A} - A^\circ$ is the boundary of a set $A$ in $\mathcal{X}$. Properties
of metric spaces in 5.3 are to be used without further comment.

2. $P$ is a pr. on $\mathcal{S}$ and $A$ in $\mathcal{X}$ is a $P$-continuity set when $P(\partial A) = 0$,
$g$, $h$ are Borel functions on the Borel space $(\mathcal{X}, \mathcal{S})$ to the Borel line or
Borel space $(\mathcal{X}', \mathcal{S}')$, respectively. $D_g$ is the discontinuity set of $g$ and $g$
is $P$-a.e. continuous when $P(D_g) = 0$; similarly for $h$. If $g = I_A$ then
clearly $D_g = \partial A$. Note that for any function $h$ on $(\mathcal{X}, d)$ to $(\mathcal{X}', d')$, $D_h$
is a Borel set, since $D_h = \bigcup_r \bigcap_s D_{rs}$ where $r$ and $s$ vary over the positive rationals
and $D_{rs}$ are the open sets

$$D_{rs} = \{x : d(x, y) < s,\ d(x, z) < s,\ d'(h(y), h(z)) \ge r \text{ for some } y, z\}.$$

For later use, we observe that except for a change of notation the same 
proof as for 10.1a yields 
Change of variable formula. Let $P$ on $\mathcal{S}$ be a pr. Let $h$ be a Borel
function on $\mathcal{X}$ to $\mathcal{X}'$, and $g$ be a Borel function on $\mathcal{X}'$ to $R$. The distribution
$Ph^{-1}$ of $h$ defined by $Ph^{-1}(A') = P(h^{-1}(A'))$, $A' \in \mathcal{S}'$, the Borel field in $\mathcal{X}'$,
determines the distribution of random variables $g(h)$, and

$$\int g(h)\,dP = \int g\,d(Ph^{-1}),$$

in the sense that if either integral exists so does the other one and then both
are equal.

The main concepts and results of this section originated with Alexandrov
and their final form is primarily due to Prohorov.

*12.1 Convergence. The basic theorem below is essentially due to
Alexandrov. Any of its six equivalent properties defines weak convergence
on $\mathcal{S}$ of pr.’s $P_n$ to a pr. $P$, and we write $P_n \xrightarrow{w} P$. The usual definition is
(ii): $\int g\,dP_n \to \int g\,dP$ for all bounded continuous functions $g$. Since
$1 = P_n(\mathcal{X}) \to P(\mathcal{X}) = 1$, this “weak” convergence is in fact complete
convergence.

A. Convergence criteria. Let $P_n$, $P$ be pr.’s on the Borel field $\mathcal{S}$
of a metric space $(\mathcal{X}, d)$. Let $g$ be functions on $\mathcal{X}$ to $R$ and the integrals be
over $\mathcal{X}$.

The following six properties are equivalent and define $P_n \xrightarrow{w} P$:

I: $\int g\,dP_n \to \int g\,dP$

(i) for all bounded $P$-a.e. continuous $g$

(ii) for all bounded continuous $g$

(iii) for all bounded uniformly continuous $g$

II:

(iv) $\limsup P_n C \le PC$ for all closed sets $C$

(v) $\liminf P_n U \ge PU$ for all open sets $U$

(vi) $P_n A \to PA$ for all $P$-continuity sets $A$

Proof. Clearly (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii).

(iii) $\Rightarrow$ (iv): The function $g_m$ defined by $g_m(x) = e^{-m\,d(x, C)}$ is bounded
by 1 and uniformly continuous with $I_C \le g_m \downarrow I_C$ as $m \to \infty$. Thus
$P_n C \le \int g_m\,dP_n$ and, by Fatou-Lebesgue theorem, as $n \to \infty$ then
$m \to \infty$,

$$\limsup_n P_n C \le \int g_m\,dP \to PC.$$

(iv) $\Rightarrow$ (v): The two properties are dual: each implies the other one
by complementation.

(v) $\Rightarrow$ (vi): Since (iv) and (v) are equivalent, by using both and the
fact that $A^\circ \subset A \subset \bar{A}$ we obtain

$$PA^\circ \le \liminf P_n A^\circ \le \liminf P_n A \le \limsup P_n A \le \limsup P_n \bar{A} \le P\bar{A}.$$

Since $P(\bar{A} - A^\circ) = P(\partial A) = 0$ by hypothesis in (vi), we have
$PA^\circ = P\bar{A}$ so that in the above inequalities the extreme terms hence all
the terms coincide and $PA = \lim P_n A$.

(vi) $\Rightarrow$ (i): The method of proof of Helly-Bray lemma in 11.5 still
applies but with another necessary change due to the fact that it is the
range space, and not the domain, of $g$ which is $R$. The sets $g^{-1}(c) =
\{x : g(x) = c\}$ are disjoint for distinct $c \in R$. Since $P(\mathcal{X})$ is finite, it follows
that $P(g^{-1}(c)) > 0$ only for a countable set $D$ of values of $c$. Since
$g$ is bounded there is a bounded interval $[a, b)$ with $g(\mathcal{X}) \subset [a, b)$. We
can take $a = x_{m1} < \cdots < x_{m,k_m+1} = b \notin D$ with no $x_{mk} \in D$ for
$k \le k_m$, $m = 1, 2, \ldots$, and $\max_k\, (x_{m,k+1} - x_{mk}) \to 0$ as $m \to \infty$.

Let $I_{mk}$ be indicators of the $J_{mk} = g^{-1}[x_{mk}, x_{m,k+1})$, omit the empty $J_{mk}$,
set $\underline{g}_{mk} = \inf\{g(x) : x \in J_{mk}\}$, $\overline{g}_{mk} = \sup\{g(x) : x \in J_{mk}\}$, and

$$\underline{g}_m = \sum_k \underline{g}_{mk}\, I_{mk}, \qquad \overline{g}_m = \sum_k \overline{g}_{mk}\, I_{mk}.$$

Since, by (vi), $P_n(J_{mk}) \to P(J_{mk})$, it follows that, as $n \to \infty$,

$$\int \underline{g}_m\,dP \leftarrow \int \underline{g}_m\,dP_n \le \int \overline{g}_m\,dP_n \to \int \overline{g}_m\,dP,$$

while $P(D_g) = 0$, by hypothesis in (i), implies that, as $m \to \infty$, $P$-a.e.

$$\underline{g}_m \uparrow g \downarrow \overline{g}_m.$$

Therefore, letting $n \to \infty$ then $m \to \infty$ in

$$\int \underline{g}_m\,dP_n \le \int g\,dP_n \le \int \overline{g}_m\,dP_n,$$

we obtain (i):

$$\int g\,dP_n \to \int g\,dP.$$

The proof is terminated.
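
The inequalities in (iv) and (v) can be strict, and (vi) may fail for sets which are not $P$-continuity sets. The sketch below illustrates this under assumptions not taken from the text: $P_n$ is the unit mass at $1/n$ on the real line and $P$ the unit mass at 0, so that $P_n \xrightarrow{w} P$; sets are represented by membership tests, the helper `dirac` is hypothetical, and a finite tail of the sequence stands in for the limits.

```python
def dirac(point: float):
    """Unit mass at `point`, as a function on sets given by membership tests."""
    return lambda contains: 1.0 if contains(point) else 0.0

P = dirac(0.0)
P_n = [dirac(1.0 / n) for n in range(1, 201)]

closed_C = lambda x: x == 0.0          # C = {0}, closed
open_U = lambda x: 0.0 < x < 2.0       # U = (0, 2), open; its boundary contains 0
cont_A = lambda x: -1.0 < x < 2.0      # A = (-1, 2); its boundary {-1, 2} has P-measure 0

tail = P_n[100:]                       # finite stand-in for the behaviour as n grows
limsup_C = max(p(closed_C) for p in tail)   # 0, while PC = 1
liminf_U = min(p(open_U) for p in tail)     # 1, while PU = 0
values_A = [p(cont_A) for p in tail]        # all equal to PA = 1
```

The set $U = (0, 2)$ is not a $P$-continuity set, since $P(\partial U) = P\{0, 2\} = 1$, and indeed $P_n U = 1$ does not converge to $PU = 0$.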


Corollary 1. If $P_n \xrightarrow{w} P$ then $P_n h^{-1} \xrightarrow{w} Ph^{-1}$ for every $P$-a.e. continuous
$h$ on $\mathcal{X}$ to $\mathcal{X}'$; equivalently $\int g(h)\,dP_n \to \int g(h)\,dP$ for all bounded
continuous $g$ on $\mathcal{X}'$ to $R$.

For, $PD_h = 0$ and $\overline{h^{-1}C} \subset (h^{-1}C) \cup D_h$ for every closed $C$, imply
$P(\overline{h^{-1}C}) = P(h^{-1}C)$ hence

$$\limsup P_n(h^{-1}C) \le \limsup P_n(\overline{h^{-1}C}) \le P(\overline{h^{-1}C}) = P(h^{-1}C)$$

and, by A(iv), $P_n h^{-1} \xrightarrow{w} Ph^{-1}$. The equivalence assertion results at once from the
change of variable formula by A(ii).
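
The hypothesis that $h$ be $P$-a.e. continuous cannot be dropped. The sketch below is an illustration under assumed data, not part of the text: $P_n$ is the unit mass at $1/n$ and $P$ the unit mass at 0 on the real line, so that $P_n h^{-1}$ and $Ph^{-1}$ are the unit masses at $h(1/n)$ and $h(0)$; `push`, `step`, `square` and the test function `g` are hypothetical names.

```python
def push(point: float, h) -> float:
    """Image of the unit mass at `point` under h: the unit mass at h(point)."""
    return h(point)

step = lambda x: 1.0 if x > 0 else 0.0     # discontinuous at 0, where P puts all its mass
square = lambda x: x * x                   # continuous everywhere
g = lambda y: min(1.0, abs(y))             # a bounded continuous function on R

n = 1000                                   # a large index standing in for n -> infinity
# |integral of g(h) dP_n - integral of g(h) dP| for each choice of h
gap_square = abs(g(push(1.0 / n, square)) - g(push(0.0, square)))
gap_step = abs(g(push(1.0 / n, step)) - g(push(0.0, step)))
```

With the continuous `square` the gap is of order $1/n^2$, while with `step`, for which $P(D_h) = 1$, the gap stays equal to 1 for every $n$.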

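Corollary 2. If $P_n \to P$ on $\mathcal{C} \subset \mathcal{S}$ where $\mathcal{C}$ is closed under finite
intersections and each open set is a countable union of members of $\mathcal{C}$, then
$P_n \xrightarrow{w} P$.

Proof. Let $U = \bigcup_{k=1}^{\infty} A_k$, $A_k \in \mathcal{C}$. By hypothesis,

$$P_n(A_1 \cup A_2) = P_n(A_1) + P_n(A_2) - P_n(A_1 A_2) \to P(A_1) + P(A_2) - P(A_1 A_2) = P(A_1 \cup A_2)$$

and, by induction, for every integer $m$,

$$P_n(A_1 \cup \cdots \cup A_m) \to P(A_1 \cup \cdots \cup A_m).$$

Since $U_m = \bigcup_{k=1}^{m} A_k \uparrow U$ as $m \to \infty$, there is an $m = m_\varepsilon$ such that
$PU - \varepsilon \le PU_m$. Therefore,

$$PU - \varepsilon \le PU_m = \lim_n P_n U_m \le \liminf_n P_n U$$

and, letting $\varepsilon \downarrow 0$,

$$\liminf P_n U \ge PU$$

so that A(v) holds, and $P_n \xrightarrow{w} P$.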

Corollary 3. Let $\mathcal{X}$ be separable and let $P_n \to P$ on $\mathcal{C} \subset \mathcal{S}$. Then
$P_n \xrightarrow{w} P$ if

(i) $\mathcal{C}$ is closed under finite intersections and, given $\varepsilon > 0$ and open $U$,
for every $x \in U$ there is an $A \in \mathcal{C}$ with $x \in A^\circ \subset A \subset U$,

or

(ii) $\mathcal{C}$ consists of those finite intersections of open spheres which are $P$-continuity
sets.

Proof. (i): Since $\mathcal{X}$ is separable, given open $U$, there is a sequence
$(A_n)$ in $\mathcal{C}$ with $U = \bigcup A_n^\circ$ and $A_n \subset U$ so that $U = \bigcup A_n$ and Corollary
2 applies.

Note that the second condition on $\mathcal{C}$ in (i) is implied by: for every $x$
and every $r > 0$ there is an $A \in \mathcal{C}$ with

$$x \in A^\circ \subset A \subset S_x(r), \text{ the open } r\text{-sphere about } x.$$

(ii): Since $\partial(AB) \subset \partial A \cup \partial B$ while $\partial S_x(r) \subset \{y : d(x, y) = r\}$ has
$P$-measure 0 except for countably many values of $r$, (i) applies, and the
proof is terminated.

*12.2 Regularity and tightness. Since the Borel field of the metric
space $\mathcal{X}$ is generated by the class of open (of closed) sets, it is to be expected
that a pr. $P$ on $\mathcal{S}$ would be determined by its restriction to such
a class.

a. Regularity lemma. Every pr. $P$ on $\mathcal{S}$ is regular: given $A \in \mathcal{S}$
and $\varepsilon > 0$, there are open $U_\varepsilon$ and closed $C_\varepsilon$ such that

$$C_\varepsilon \subset A \subset U_\varepsilon \quad \text{and} \quad P(U_\varepsilon - C_\varepsilon) < \varepsilon,$$

equivalently,

$$PA = \sup_{C \subset A} PC = \inf_{U \supset A} PU.$$

Proof. The equivalence assertion is immediate. To prove the $\varepsilon$-assertion,
let $\mathcal{C} \subset \mathcal{S}$ be the subclass of those Borel sets for which the assertion
holds.

$\mathcal{C}$ contains the class of closed sets $C$ since open $U_r = \{x : d(x, C) < r\} \downarrow
C$ as $r \downarrow 0$. It is clearly closed under complementations. Also it is
closed under countable unions: Given $A_n \in \mathcal{C}$ and $\varepsilon > 0$, there are
$C_n \subset A_n \subset U_n$ with $P(U_n - C_n) < \varepsilon/2^{n+1}$; take $U_\varepsilon = \bigcup U_n$ and $C_\varepsilon =
\bigcup_{n \le m} C_n$ with $m$ such that $P(\bigcup C_n - C_\varepsilon) < \varepsilon/2$, so that, for $A = \bigcup A_n$,
$C_\varepsilon \subset A \subset U_\varepsilon$ and $P(U_\varepsilon - C_\varepsilon) < \varepsilon$. Thus $\mathcal{C} \subset \mathcal{S}$ is a $\sigma$-field containing the class of
closed sets hence $\mathcal{C} = \mathcal{S}$.

Corollary. The set $\{\int g\,dP : g \text{ bounded uniformly continuous}\}$ determines
$P$.

For, the functions $g_m$ defined by $g_m(x) = e^{-m\,d(x, C)}$ are bounded and
uniformly continuous with $g_m = 1$ on $C$ and $g_m \downarrow 0$ on $C^c$ as $m \to \infty$, so
that $\int g_m\,dP \to PC$.

The concept of “tightness” below was named by Le Cam in a memoir 
which followed within a year that of Prohorov and extended the whole 
theory to much more general topological spaces than the metric ones. 
A family $\mathcal{P}$ of pr.’s on $\mathcal{S}$ is said to be tight if for every $\varepsilon > 0$ there is a
compact $K_\varepsilon$ such that $PK_\varepsilon^c < \varepsilon$ for all $P \in \mathcal{P}$. We say that $\mathcal{P}$ lives on a
Borel set $\mathcal{X}_0$ if $P(\mathcal{X}_0) = 1$ for all $P \in \mathcal{P}$, equivalently, if $PA = PA\mathcal{X}_0$ for
every $A \in \mathcal{S}$ and for all $P \in \mathcal{P}$. If $\mathcal{P} = \{P\}$ is a singleton we replace
above the family $\mathcal{P}$ by the pr. $P$. Given a Borel set $\mathcal{X}_0$, the $\sigma$-field $\mathcal{S}_0 =
\{A : A \subset \mathcal{X}_0,\ A \in \mathcal{S}\}$ is the Borel field of the metric space $\mathcal{X}_0$ with its
relative topology. Thus the above definitions apply to families of pr.’s
on $\mathcal{S}_0 \subset \mathcal{S}$.

b. Tightness lemma. (i) If a pr. $P$ on $\mathcal{S}$ is tight then it lives on a
$\sigma$-compact $\mathcal{X}_0$ and $PA = \sup_{K \subset A} PK$ for every $A \in \mathcal{S}$.

(ii) Conversely, if $P$ on $\mathcal{S}$ lives on a $\sigma$-compact $\mathcal{X}_0$ or if $PA = \sup_{K \subset A} PK$
for every $A \in \mathcal{S}$ then $P$ is tight.

(iii) Every pr. $P$ on $\mathcal{S}$ is tight when $\mathcal{X}$ is separable and complete.

Proof. 1°. If $P$ is tight then for every $n$ there is a compact $K_n$ with
$PK_n^c < 1/n$, so that $P(\bigcap K_n^c) = 0$ and $P$ lives on the $\sigma$-compact $\mathcal{X}_0 =
\bigcup K_n$. Note that $\mathcal{X}_0$ is separable since compacts in metric spaces are
separable.

By a, $P$ is regular so that, given $A \in \mathcal{S}$ and $\varepsilon > 0$, there is a closed
$C \subset A$ with $P(A - C) < \varepsilon/2$. But for $n$ sufficiently large, $PK_n^c < \varepsilon/2$
and $K_\varepsilon = CK_n$ is compact with $K_\varepsilon \subset C \subset A$. Since

$$P(A - K_\varepsilon) \le P(A - C) + P(C - K_\varepsilon) < \varepsilon/2 + P(\mathcal{X} - K_n) < \varepsilon$$

and $\varepsilon > 0$ is arbitrarily small, it follows that $PA = \sup_{K \subset A} PK$, and (i) is
proved.

Conversely, if $P$ lives on $\mathcal{X}_0 = \bigcup K_n$, that is, $P(\bigcup K_n) = 1$ then, given
$\varepsilon > 0$, there is an $m$ such that $PK_\varepsilon^c < \varepsilon$ for compact $K_\varepsilon = \bigcup_{n \le m} K_n$, and $P$
is tight. This proves the first assertion in (ii) and the second is immediate.

2°. When $\mathcal{X}$ is separable then, for every $n$, open $1/n$-spheres $U_{n1}$,
$U_{n2}, \ldots$ cover $\mathcal{X}$. Therefore, given a pr. $P$ on $\mathcal{S}$ and $\varepsilon > 0$, for $k_n$ sufficiently
large $PU_n^c < \varepsilon/2^{n+1}$ with $U_n = \bigcup_{k \le k_n} U_{nk}$. When moreover $\mathcal{X}$ is complete
then the closure $K_\varepsilon$ of the totally bounded set $\bigcap_n U_n$ is compact.
Since

$$PK_\varepsilon^c \le P\Bigl(\bigcup_n U_n^c\Bigr) < \sum_n \varepsilon/2^{n+1} \le \varepsilon,$$

$P$ is tight and (iii) is proved.
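
Although every single pr. on $R$ is tight by (iii), a family need not be. The sketch below compares two assumed families of normal pr.’s with unit variance on $R$ (neither appears in the text): one with means in $[-1, 1]$, the other with means running through all of $[0, \infty)$. Compacts are taken to be intervals $[-a, a]$, and `normal_mass`, `worst_outside` are hypothetical helper names.

```python
import math

def normal_mass(mean: float, a: float) -> float:
    """P[-a, a] for the normal pr. with the given mean and unit variance."""
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return Phi(a - mean) - Phi(-a - mean)

def worst_outside(means, a: float) -> float:
    """Largest value of P(K^c) over the listed members, for the compact K = [-a, a]."""
    return max(1.0 - normal_mass(m, a) for m in means)

# Means confined to [-1, 1]: one compact serves all members at once (a grid of
# means is checked here; the extreme means -1 and 1 are the worst cases).
bounded_means = [k / 10.0 for k in range(-10, 11)]
tight_check = worst_outside(bounded_means, 6.0)

# Unbounded means: whatever a is, the member with mean 2a leaves almost all of
# its mass outside [-a, a], so no single compact can work for the whole family.
escape = [1.0 - normal_mass(2.0 * a, a) for a in (5.0, 50.0, 500.0)]
```

The first family is tight; the second is not, although each of its members is.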

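A. Tightness theorem. Let the family $\mathcal{P}$ of pr.’s on $\mathcal{S}$ be tight. Then

(i) $\mathcal{P}$ lives on a $\sigma$-compact set $\mathcal{X}_0$ and $PA = \sup_{K \subset A} PK$ for every $A \in \mathcal{S}$.

(ii) The family $\mathcal{P}h^{-1} = \{Ph^{-1} : P \in \mathcal{P}\}$ is tight for every continuous
function $h$ on $\mathcal{X}$ to a metric space $\mathcal{X}'$.

Proof. The proof of (i) is exactly the same as that of b(i); it suffices
to observe that the compacts $K_n$ therein are the same for all $P \in \mathcal{P}$.
Note that, in general, b(ii) does not hold for families $\mathcal{P}$.

For (ii), given $\varepsilon > 0$ there is a compact $K_\varepsilon$ with $PK_\varepsilon^c < \varepsilon$ for all $P \in \mathcal{P}$.
Since $h$ on $\mathcal{X}$ to $\mathcal{X}'$ is continuous, $K'_\varepsilon = h(K_\varepsilon)$ is compact in $\mathcal{X}'$ and
$h^{-1}(K'_\varepsilon) \supset K_\varepsilon$ implies that for all $P \in \mathcal{P}$

$$Ph^{-1}(K'_\varepsilon)^c = P\bigl(h^{-1}(K'^{\,c}_\varepsilon)\bigr) = P\bigl(h^{-1}K'_\varepsilon\bigr)^c \le PK_\varepsilon^c < \varepsilon,$$

and $\mathcal{P}h^{-1}$ is tight. The proof is terminated.

Let $\mathcal{S}_0$ be the $\sigma$-field of Borel sets on a Borel set $\mathcal{X}_0 \subset \mathcal{X}$. Given a
family $\mathcal{P}^0$ of pr.’s on $\mathcal{S}_0$, its extension to $\mathcal{S}$ is defined by

$$\mathcal{P} = \{P : PA = P^0(A\mathcal{X}_0),\ P^0 \in \mathcal{P}^0,\ A \in \mathcal{S}\};$$

note that $P\mathcal{X}_0 = 1$ for all $P \in \mathcal{P}$.

Corollary. (i) If $\mathcal{P}^0$ on $\mathcal{S}_0$ is tight so is its extension $\mathcal{P}$ to $\mathcal{S}$.

(ii) If $P_n^0 \xrightarrow{w} P^0$ on $\mathcal{S}_0$ then their extensions $P_n \xrightarrow{w} P$ on $\mathcal{S}$.

For, upon taking $h$ to be the (continuous) identity mapping of $\mathcal{X}_0$ into
$\mathcal{X}$, A(ii) yields (i) and 12.1A Corollary 1 yields (ii).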

*12.3 Tightness and relative compactness. We say that a family $\mathcal{P}$ of
pr.’s on $\mathcal{S}$ is relatively compact if every sequence of members of $\mathcal{P}$ contains
a subsequence which converges weakly to a pr. on $\mathcal{S}$. Thus “relative
compactness” is, in fact, relative sequential complete compactness.

Prohorov theorem below is the second basic theorem of this section.

A. Relative compactness criterion. Let $\mathcal{X}$ be a separable complete
metric space. Then a family $\mathcal{P}$ of pr.’s on its Borel field $\mathcal{S}$ is relatively compact
if and only if $\mathcal{P}$ is tight. In fact, the “if” part holds for general metric
spaces $\mathcal{X}$.

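Proof. 1°. Let $\mathcal{P}$ be relatively compact. Since $\mathcal{X}$ is separable, for
every $r > 0$ there are open $r$-spheres $U_1, U_2, \ldots$ which cover $\mathcal{X}$ so that
$V_n = U_1 \cup \cdots \cup U_n \uparrow \mathcal{X}$. Given $\varepsilon > 0$, there is an $n$ such that $PV_n^c < \varepsilon$
for all $P \in \mathcal{P}$: Otherwise, for every $n$ there is some $P_n \in \mathcal{P}$ with $P_n V_n \le
1 - \varepsilon$ and, by relative compactness, the sequence $(P_n)$ contains a subsequence
$P_{n'} \xrightarrow{w}$ some pr. $P$ on $\mathcal{S}$; thus, by 12.1A(v), for every $n$

$$PV_n \le \liminf_{n'} P_{n'} V_n \le \liminf_{n'} P_{n'} V_{n'} \le 1 - \varepsilon,$$

while $PV_n \uparrow 1$, a contradiction. Applying this with $r = 1/k$ and $\varepsilon/2^{k+1}$ in
place of $\varepsilon$ for $k = 1, 2, \ldots$, the argument of 2° in the proof of 12.2b then
yields, since $\mathcal{X}$ is complete, a compact $K_\varepsilon$ with $PK_\varepsilon^c < \varepsilon$ for all $P \in \mathcal{P}$.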

2°. For the “if” part we follow Billingsley who bypasses Prohorov’s
use of integral representation of linear functionals by hewing closely to
Halmos’ generation of Borel measures from “content” to “inner content”
to “outer extension.” The difference is that “content” is defined
by Halmos on the class of all compacts while here the corresponding set
function has the same properties but only on a subclass of compacts.

For the time being, assume that $\mathcal{X}$ is separable so that it has a countable
base of open spheres $U_1, U_2, \ldots$; include $\mathcal{X}$ in this base.

Let $\mathcal{P}$ be tight so that for every $n$ there is a compact $K(n)$ with
$P(K(n))^c < 1/n$ for all $P \in \mathcal{P}$. Let $\mathcal{K}$ consist of all finite unions of sets
of the form $\bar{U}_m K(n)$. Thus the class $\mathcal{K}$ is countable, closed under finite
unions, and its members, to be denoted by $K$ with or without affixes,
are compact.

Given a sequence $(P_n)$ of members of $\mathcal{P}$, Cantor’s diagonal procedure
yields a subsequence $P_{n'} \to$ some $\lambda$ on $\mathcal{K}$. We have to prove that $P_{n'} \xrightarrow{w}$
some pr. $P$ on $\mathcal{S}$.

Let

$$\lambda_0 U = \sup_{K \subset U} \lambda K, \qquad \lambda^0 A = \inf_{U \supset A} \lambda_0 U,$$

so that $\lambda$ is defined on $\mathcal{K}$, $\lambda_0$ on the class $\mathcal{U}$ of open sets $U$, and $\lambda^0$ on the
class of all subsets. We shall show that the restriction of $\lambda^0$ to $\mathcal{S}$ is precisely
the pr. $P$.

Clearly, $\lambda$ on $\mathcal{K}$ is nondecreasing, additive, and subadditive: $K_1 \subset
K_2 \Rightarrow \lambda K_1 \le \lambda K_2$, $\lambda(K_1 + K_2) = \lambda K_1 + \lambda K_2$, $\lambda(K_1 \cup K_2) \le \lambda K_1 + \lambda K_2$,
$\lambda_0$ and $\lambda^0$ are nondecreasing, and $\lambda^0 = \lambda_0$ on $\mathcal{U}$. We shall use these properties
without further comment.

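3°. $\lambda_0$ on $\mathcal{U}$ is $\sigma$-subadditive:

Let $K \subset U_1 \cup U_2$ and set

$$C_1 = \{x \in K : d(x, U_1^c) \ge d(x, U_2^c)\},$$

$$C_2 = \{x \in K : d(x, U_2^c) \ge d(x, U_1^c)\}.$$

These closed sets, being contained in compact $K$, are compact and so are
$C_1U_1^c$ and $C_2U_2^c$. If $x \in C_1U_1^c \ne \emptyset$, then $x$ belongs to $U_2$, so that $d(x, U_1^c) =
0 < d(x, U_2^c)$ hence $x \notin C_1$, a contradiction. Thus $C_1 \subset U_1$ and, by definition
of $\mathcal{K}$, $C_1 \subset K_1 \subset U_1$ for some $K_1$; similarly $C_2 \subset K_2 \subset U_2$.
Therefore, upon taking the supremum in $K$ in

$$\lambda K \le \lambda(K_1 \cup K_2) \le \lambda K_1 + \lambda K_2 \le \lambda_0 U_1 + \lambda_0 U_2,$$

we obtain $\lambda_0(U_1 \cup U_2) \le \lambda_0 U_1 + \lambda_0 U_2$, so that $\lambda_0$ on $\mathcal{U}$ is subadditive
and, by induction, is finitely subadditive. Now, if $K \subset U = \bigcup U_n$ then, by
compactness, $K \subset \bigcup_{n \le m} U_n$ for some $m$. Therefore, upon taking
the supremum in $K$ in

$$\lambda K \le \lambda_0\Bigl(\bigcup_{n \le m} U_n\Bigr) \le \sum_{n \le m} \lambda_0 U_n \le \sum_n \lambda_0 U_n$$

we obtain $\lambda_0(\bigcup U_n) \le \sum_n \lambda_0 U_n$ so that $\lambda_0$ on $\mathcal{U}$ is $\sigma$-subadditive.

For closed $C$ and open $U$, $\lambda_0 U \ge \lambda^0(UC) + \lambda_0(UC^c)$: Given $\varepsilon > 0$, there
is a $K_1 \subset UC^c$ with $\lambda K_1 > \lambda_0(UC^c) - \varepsilon/2$, and then there is a $K_2 \subset UK_1^c$
with $\lambda K_2 > \lambda_0(UK_1^c) - \varepsilon/2$. Since $K_1$ and $K_2$ are disjoint and contained
in $U$,

$$\lambda_0 U \ge \lambda(K_1 + K_2) = \lambda K_1 + \lambda K_2 > \lambda_0(UC^c) + \lambda_0(UK_1^c) - \varepsilon \ge \lambda_0(UC^c) + \lambda^0(UC) - \varepsilon$$

hence, letting $\varepsilon \to 0$, the assertion is proved.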

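4°. $\lambda^0$ is an outer measure and Borel sets are $\lambda^0$-measurable:

Given $\varepsilon > 0$ and $A_n \subset \mathcal{X}$ there are $U_n \supset A_n$ with $\lambda_0 U_n < \lambda^0 A_n +
\varepsilon/2^{n+1}$. Since $\lambda_0$ is $\sigma$-subadditive,

$$\lambda^0\Bigl(\bigcup_n A_n\Bigr) \le \lambda_0\Bigl(\bigcup_n U_n\Bigr) \le \sum_n \lambda_0 U_n < \sum_n \lambda^0 A_n + \varepsilon$$

so that, letting $\varepsilon \to 0$, $\lambda^0$ is $\sigma$-subadditive. Since $\lambda^0$ is also nondecreasing,
$\lambda^0$ is an outer measure. Furthermore, for closed $C$ and open $U \supset A$,
upon taking the infimum in $U$ in

$$\lambda_0 U \ge \lambda^0(UC) + \lambda_0(UC^c) \ge \lambda^0(AC) + \lambda^0(AC^c),$$

we obtain

$$\lambda^0 A \ge \lambda^0(AC) + \lambda^0(AC^c),$$

so that closed sets are $\lambda^0$-measurable. Therefore, the Borel field $\mathcal{S}$ (that
the class of closed sets generates) is contained in the $\sigma$-field of $\lambda^0$-measurable
sets.

5°. Let $P$ be the restriction of $\lambda^0$ to $\mathcal{S}$, so that $P$ on $\mathcal{S}$ is a measure;
in fact, $P$ is a pr. since

$$1 \ge P\mathcal{X} = \lambda_0 \mathcal{X} = \sup_n \lambda(K(n)) \ge \sup_n \Bigl(1 - \frac{1}{n}\Bigr) = 1.$$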
Since for all open $U$

$$PU = \lambda_0 U = \sup_{K \subset U} \lambda K,$$

upon taking the supremum in $K \subset U$ in

$$\lambda K = \lim_{n'} P_{n'} K \le \liminf_{n'} P_{n'} U,$$

we obtain

$$PU \le \liminf_{n'} P_{n'} U.$$

Thus, by 12.1A(v), $P_{n'} \xrightarrow{w} P$ and the “if” part is proved but under the restriction
of separability of $\mathcal{X}$.

Now, let $\mathcal{X}$ be a general metric space. By 12.2A, $\mathcal{P}$ on $\mathcal{S}$, being tight,
lives on a $\sigma$-compact $\mathcal{X}_0$, a separable metric space in its relative topology.
Thus what precedes applies to the restriction $\mathcal{P}^0$ of $\mathcal{P}$ to the Borel field
$\mathcal{S}_0$ of $\mathcal{X}_0$. But, by 12.2A Corollary (ii), $P^0_{n'} \xrightarrow{w} P^0$ on $\mathcal{S}_0$ implies $P_{n'} \xrightarrow{w} P$
on $\mathcal{S}$. The proof is terminated.

Corollary. Let $\mathcal{X}$ be separable and complete. Then $\mathcal{P}$ on $\mathcal{S}$ is relatively
compact if and only if, for every $\varepsilon > 0$ and $r > 0$, there is a finite
union $V_n$ of $r$-open spheres with $PV_n^c < \varepsilon$ for all $P \in \mathcal{P}$.

§ 13. CHARACTERISTIC FUNCTIONS AND DISTRIBUTION FUNCTIONS 

Pr. properties are properties describable in terms of distributions,
and those are set functions. The introduction of d.f.'s makes it possible
to describe pr. properties in terms of point functions, easier to
handle with the tools of classical analysis. Yet, to a distribution
corresponds not a single d.f. $F$ but the family of all functions $F + c$ where $c$
is an arbitrary constant. The selection of one of them is somewhat
arbitrary, and we have constantly to bear this fact in mind. The
introduction of characteristic functions (ch.f.) assigned to the family
$F + c$ by the relation

\[ f(u) = \int e^{iux}\,dF(x), \quad u \in R \]

obviates this difficulty and, moreover, is of the greatest practical
importance for the following reasons.

1° To the family $F + c$ corresponds a unique ch.f., and conversely.
Therefore, there is a one-to-one correspondence between distributions
and ch.f.'s.

2° The methods and results of classical analysis are particularly
well suited to the handling of ch.f.'s. In fact, ch.f.'s are continuous
and uniformly bounded (by 1) functions. Moreover, to complete and
weak convergence of d.f.'s (defined up to additive constants) correspond,
respectively, ordinary convergence of ch.f.'s and ordinary convergence
of their indefinite integrals.

3° The oldest and, until recent years, almost the only general
problem of pr. theory is the "Central Limit Problem," concerned with
the asymptotic behavior of d.f.'s of sequences of sums of independent
r.v.'s. Much of Part III will be devoted to this problem. The d.f.'s
of such sums are obtained by "composition" of the d.f.'s of their summands,
and this "composition" involves repeated integrations and results
in unwieldy expressions, whereas the ch.f.'s of these sums are
simply the products of the ch.f.'s of the summands. The Central Limit
Problem was satisfactorily solved in the 15 years (1925 - 1940) which
followed the establishment by P. Lévy of the properties of ch.f.'s.

13.1 Uniqueness. The characteristic function (ch.f.) $f$ of a d.f. $F$ is
defined on $R$ by

\[ f(u) = \int e^{iux}\,dF(x) = \int \cos ux\,dF(x) + i\int \sin ux\,dF(x), \quad u \in R. \]

Since, for every $u \in R$, the function of $x$ with values $e^{iux}$ is continuous
and bounded by 1, $f$ exists and is continuous and bounded by 1 on $R$.
Moreover, to all functions $F + c$, where $c$ is an arbitrary constant,
corresponds the same function $f$. The converse (and, thus, the one-to-one
correspondence between distributions and ch.f.'s) follows from the
formula below.


A. Inversion formula.

\[ F[a, b) = \lim_{U \to \infty} \frac{1}{2\pi} \int_{-U}^{+U} \frac{e^{-iua} - e^{-iub}}{iu}\, f(u)\,du, \]

provided $a < b$ are continuity points of $F$.

The inversion formula holds for all $a < b \in R$, provided $F$ is normalized.

We say that $F$ is normalized, if the values of $F$ at its discontinuity points
$x$ are taken to be $\dfrac{F(x - 0) + F(x + 0)}{2}$. Normalization destroys the
continuity from the left of $F$ at its discontinuity points. However,
according to 11.1, the normalized d.f. determines the original one, so
that nothing is lost by normalization.

We observe that, in the integral which figures on the right-hand side
of the inversion formula, the integrand is defined at $u = 0$ by continuity,
so that it is continuous on $R$; also it is bounded on $R$ by its value
$(b - a) f(0)$ at $u = 0$. Thus, for every finite $U$, this integral is an ordinary
Riemann integral and, in proving the inversion formula, we shall find
that the limit of this integral, as $U \to \infty$, exists.

Proof. The proof uses repeatedly the dominated convergence theorem
applied to an interchange of integrations and is based on the classical
Dirichlet formula

\[ \frac{1}{\pi} \int_a^b \frac{\sin v}{v}\,dv \to 1 \quad \text{as } a \to -\infty,\ b \to +\infty, \]

so that the left-hand side is bounded uniformly in $a$ and $b$. Let

\[ I_U = \frac{1}{2\pi} \int_{-U}^{+U} \frac{e^{-iua} - e^{-iub}}{iu}\, f(u)\,du, \quad a < b \in R, \]

and replace $f(u)$ by its defining integral $\int e^{iux}\,dF(x)$. We can
interchange the integrations, so that, by elementary computations,

\[ I_U = \int J_U(x)\,dF(x), \]

where

\[ J_U(x) = \frac{1}{\pi} \int_{U(x-b)}^{U(x-a)} \frac{\sin v}{v}\,dv. \]

Since $J_U$ is bounded uniformly in $U$, integration and passage to the
limit as $U \to \infty$ can be interchanged in

\[ \lim_{U \to \infty} I_U = \lim_{U \to \infty} \int J_U(x)\,dF(x). \]

Therefore

\[ \lim_{U \to \infty} I_U = \int J(x)\,dF(x) \]

where

\[ J(x) = \lim_{U \to \infty} J_U(x) = \begin{cases} 1 & \text{for } a < x < b \\ \tfrac12 & \text{for } x = a,\ x = b \\ 0 & \text{for } x < a,\ x > b, \end{cases} \]

and, hence,

\[ \lim_{U \to \infty} I_U = \tfrac12\{F(a+0) - F(a-0)\} + \{F(b-0) - F(a+0)\} + \tfrac12\{F(b+0) - F(b-0)\} \]
\[ = \frac{F(b-0) + F(b+0)}{2} - \frac{F(a-0) + F(a+0)}{2}. \]

Thus, if $F$ is normalized or if $a < b \in C(F)$, then

\[ \lim_{U \to \infty} I_U = F[a, b), \]

and the inversion formula is proved.
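The inversion formula can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes the reduced normal ch.f. $e^{-u^2/2}$, truncates the integral at a hypothetical $U = 12$, and uses a plain midpoint rule; the names `inversion_increment`, `normal_chf` and `normal_df` are introduced here for the example.

```python
import cmath
import math

def normal_chf(u):
    # Ch.f. of the reduced normal law (the law chosen for this illustration).
    return math.exp(-u * u / 2.0)

def inversion_increment(f, a, b, U=12.0, steps=20000):
    """Midpoint-rule value of (1/2pi) * int_{-U}^{U} (e^{-iua} - e^{-iub})/(iu) f(u) du."""
    h = 2.0 * U / steps
    total = 0j
    for j in range(steps):
        u = -U + (j + 0.5) * h  # with an even number of steps, u is never 0
        kernel = (cmath.exp(-1j * u * a) - cmath.exp(-1j * u * b)) / (1j * u)
        total += kernel * f(u)
    return (total * h / (2.0 * math.pi)).real

def normal_df(x):
    # Reduced normal d.f., used only as the reference value F(b) - F(a).
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

a, b = -1.0, 2.0
approx = inversion_increment(normal_chf, a, b)
exact = normal_df(b) - normal_df(a)
print(approx, exact)
```

For a law with atoms the same routine would return the normalized increment, in agreement with the last display of the proof.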

Remark. If an improper Riemann integral

\[ \int_{-\infty}^{+\infty} g\,dx = \lim_{a \to -\infty,\ b \to +\infty} \int_a^b g\,dx \]

exists and is finite, then

\[ \lim_{U \to \infty} \int_{-U}^{+U} g\,dx = \int_{-\infty}^{+\infty} g\,dx. \]

However, the left-hand side limit may exist and be finite (as in the
inversion formula), whereas the right-hand side improper integral does
not exist. Yet the inversion formula can be written in terms of an
improper Riemann integral as follows:

\[ F[a, b) = \frac{1}{\pi} \int_0^{\infty} \Im\{(e^{-iua} - e^{-iub}) f(u)\}\,\frac{du}{u} \]

where $\Im$ stands for "imaginary part of," so that

\[ \Im\{(e^{-iua} - e^{-iub}) f(u)\} = (\cos ua - \cos ub)\,\Im f(u) - (\sin ua - \sin ub)\,\Re f(u). \]

It suffices to write $\int_{-U}^{+U} = \int_{-U}^{0} + \int_{0}^{+U}$, change $u$ into $-u$ in the first
right-hand side integral, and take into account the fact that then the
integrand changes into its complex-conjugate.

Corollary. $F$ is differentiable at $a$ and its derivative $F'(a)$ at $a$ is
given by

\[ (1) \quad F'(a) = \lim_{h \to 0} \lim_{U \to \infty} \frac{1}{2\pi} \int_{-U}^{+U} \frac{1 - e^{-iuh}}{iuh}\, e^{-iua} f(u)\,du \]

if, and only if, the right-hand side exists.

In particular, if $f$ is absolutely integrable on $R$, then $F'$ exists and is
bounded and continuous on $R$ and, for every $x \in R$,

\[ (2) \quad F'(x) = \frac{1}{2\pi} \int e^{-iux} f(u)\,du. \]

Proof. The first assertion follows directly from the inversion formula
by the definition of the derivative. The second assertion follows from
the first and from the assumption that $\int |f|\,du < \infty$ since, the integrand
in (1) being bounded by $|f|$, we have, in (1),

\[ \lim_{U \to \infty} \int_{-U}^{+U} = \int_{-\infty}^{+\infty} \quad \text{and} \quad \lim_{h \to 0} \int_{-\infty}^{+\infty} = \int_{-\infty}^{+\infty} \lim_{h \to 0}. \]
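As an illustration of formula (2), the following sketch recovers a density from an absolutely integrable ch.f. It assumes the standard Cauchy law, whose ch.f. is $e^{-|u|}$ and whose density is $1/\pi(1 + x^2)$; the truncation point and step count are arbitrary choices of this example, and `density_from_chf` is a hypothetical helper name.

```python
import cmath
import math

def cauchy_chf(u):
    # Ch.f. of the standard Cauchy law; it is absolutely integrable on R.
    return math.exp(-abs(u))

def density_from_chf(f, x, U=40.0, steps=40000):
    """Midpoint-rule value of (1/2pi) * int_{-U}^{U} e^{-iux} f(u) du."""
    h = 2.0 * U / steps
    total = 0j
    for j in range(steps):
        u = -U + (j + 0.5) * h
        total += cmath.exp(-1j * u * x) * f(u)
    return (total * h / (2.0 * math.pi)).real

x = 0.7
approx = density_from_chf(cauchy_chf, x)
exact = 1.0 / (math.pi * (1.0 + x * x))
print(approx, exact)
```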

Remark. Thus, if the ch.f.'s $f_n$ of d.f.'s $F_n$ are uniformly Lebesgue-
integrable on $R$, and if $f_n \to f$ ch.f. of $F$, then $f$ is Lebesgue-integrable
on $R$, and $F'_n \to F'$.

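B. For every $x \in R$,

\[ F(x + 0) - F(x - 0) = \lim_{U \to \infty} \frac{1}{2U} \int_{-U}^{+U} e^{-iux} f(u)\,du. \]

For we can interchange below the integrations and the passage to the
limit, so that

\[ \lim_{U \to \infty} \frac{1}{2U} \int_{-U}^{+U} e^{-iux} f(u)\,du = \lim_{U \to \infty} \int \Big\{ \frac{1}{2U} \int_{-U}^{+U} e^{iu(y - x)}\,du \Big\}\,dF(y) \]
\[ = \lim_{U \to \infty} \int \frac{\sin U(y - x)}{U(y - x)}\,dF(y) = F(x + 0) - F(x - 0). \]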
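For a purely discrete law the averaged integral in B can be evaluated in closed form, since $\frac{1}{2U}\int_{-U}^{+U} e^{iu(y-x)}\,du = \sin U(y-x)/U(y-x)$. The sketch below uses that identity; the three-point law and the value of $U$ are assumptions made only for the example.

```python
import math

def jump_estimate(points, probs, x, U):
    """(1/2U) * int_{-U}^{U} e^{-iux} f(u) du for a discrete law with the given atoms."""
    total = 0.0
    for y, p in zip(points, probs):
        t = U * (y - x)
        # The kernel equals 1 at y = x and sin(t)/t elsewhere.
        total += p * (1.0 if t == 0.0 else math.sin(t) / t)
    return total

points = [0.0, 1.0, 2.5]   # hypothetical atoms
probs = [0.2, 0.5, 0.3]    # their masses

at_atom = jump_estimate(points, probs, 1.0, U=1e5)   # close to the jump 0.5
off_atom = jump_estimate(points, probs, 1.7, U=1e5)  # close to 0, a continuity point
print(at_atom, off_atom)
```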


13.2 Convergences. Since there is a one-to-one correspondence between
d.f.'s defined up to additive constants and ch.f.'s, it has to be
expected that a one-to-one correspondence also exists between the weak
and complete convergence, up to additive constants, of sequences of
d.f.'s and certain types of convergence, to be found, of ch.f.'s. For
this purpose we introduce the integral ch.f. $\hat f$ of $F$ defined on $R$ by

\[ \hat f(u) = \int_0^u f(v)\,dv = \int \frac{e^{iux} - 1}{ix}\,dF(x). \]

The last integral is obtained upon replacing $f(v)$ by its defining integral
and noting that the interchange of integrations is permissible. Since
there is a one-to-one correspondence between $\hat f$ and its continuous
derivative $f$, it follows, by 13.1, that there is a one-to-one correspondence
between $\hat f$ and $F$ defined up to an additive constant.

We are now in a position to show that the weak and the complete
convergence up to additive constants of sequences of d.f.'s correspond
to the ordinary convergence of the corresponding sequences of integral
ch.f.'s and of ch.f.'s, respectively. Unless otherwise stated, a d.f., its
ch.f., and its integral ch.f. will be denoted by $F$, $f$, $\hat f$ respectively, with
the same affixes if any.

A. Weak convergence criterion. If $F_n \xrightarrow{w} F$ up to additive constants,
then $\hat f_n \to \hat f$. Conversely, if $\hat f_n$ converges to some function $\hat g$ then
there exists a d.f. $F$ with $F_n \xrightarrow{w} F$ up to additive constants and $\hat f = \hat g$.

Proof. Since $\dfrac{e^{iux} - 1}{ix} \to 0$ as $x \to \pm\infty$, the first assertion follows at
once, by the extended Helly-Bray lemma, from the definition of the
integral ch.f.'s.

Conversely, let $\hat f_n \to \hat g$. According to the weak compactness theorem,
there is a d.f. $F$ and a subsequence $F_{n'} \xrightarrow{w} F$ as $n' \to \infty$. Therefore,
by the extended Helly-Bray lemma, for every $u \in R$,

\[ \hat g(u) = \lim_{n'} \hat f_{n'}(u) = \lim_{n'} \int \frac{e^{iux} - 1}{ix}\,dF_{n'}(x) = \int \frac{e^{iux} - 1}{ix}\,dF(x) = \hat f(u). \]

Since $\hat f$ determines $F$ up to an additive constant, it follows that weakly
convergent subsequences of the sequence $F_n$ have the same limit $F$ up to
additive constants, with $\hat f = \hat g$. This proves the second assertion.
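A small example may help to separate the two notions. Assume, for illustration only, that $F_n$ is degenerate at the point $n$. Its ch.f. $e^{iun}$ has modulus 1 and does not converge, while its integral ch.f. $(e^{iun} - 1)/in$ tends to 0, which is the integral ch.f. of a constant function: the mass escapes to infinity, so the convergence is weak but not complete. The helper names below are hypothetical.

```python
import cmath

def chf_point_mass(x, u):
    # Ch.f. of the law degenerate at x.
    return cmath.exp(1j * u * x)

def integral_chf_point_mass(x, u):
    # (e^{iux} - 1)/(ix), with the value u at x = 0 by continuity.
    if x == 0:
        return complex(u)
    return (cmath.exp(1j * u * x) - 1.0) / (1j * x)

u = 1.3
ns = (1, 10, 100, 1000)
values = [abs(integral_chf_point_mass(n, u)) for n in ns]   # bounded by 2/n
moduli = [abs(chf_point_mass(n, u)) for n in ns]            # always 1
print(values, moduli)
```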

Corollary 1. Every sequence $\hat f_n$ of integral ch.f.'s is compact in the
sense of ordinary convergence on $R$.

For, in view of the above criterion, this statement is equivalent to
the weak compactness theorem for d.f.'s.

Corollary 2. If $f_n \to g$ a.e., then $F_n \xrightarrow{w} F$ up to additive constants,
with $f = g$ a.e.

Here "a.e." is taken with respect to the Lebesgue measure on $R$.

Proof. Since $f_n \to g$ a.e. and the $f_n$ are continuous and uniformly
bounded by 1, it follows that $g$ is measurable and bounded a.e. so that,
by the dominated convergence theorem, $\hat f_n \to \hat g$ where $\hat g$ is defined on
$R$ by the Lebesgue integral

\[ \hat g(u) = \int_0^u g(v)\,dv, \quad u \in R. \]

Therefore, by the foregoing criterion, $F_n \xrightarrow{w} F$ up to additive constants,
and $\hat f = \hat g$. Since the derivative of $\hat f$ is $f$, whereas that of the indefinite
Lebesgue integral $\hat g$ exists and equals $g$ a.e., it follows that $f = g$ a.e.

B. Complete convergence criterion. If $F_n \xrightarrow{c} F$ up to additive
constants, then $f_n \to f$. Conversely, if $f_n \to g$ continuous at $u = 0$, then
$F_n \xrightarrow{c} F$ up to additive constants, and $f = g$.

When the $F_n$ and $f_n$ are d.f.'s and ch.f.'s of r.v.'s, the converse becomes
the celebrated P. Lévy's continuity theorem for ch.f.'s.

Proof. Let $F_n \xrightarrow{c} F$ up to additive constants. Then, by the Helly-
Bray theorem, for every $u \in R$,

\[ f_n(u) = \int e^{iux}\,dF_n(x) \to \int e^{iux}\,dF(x) = f(u). \]

Conversely, let $f_n \to g$ continuous at $u = 0$. Then, for every $u \in R$,

\[ \hat f_n(u) = \int_0^u f_n(v)\,dv \to \int_0^u g(v)\,dv = \hat g(u), \]

and, hence, by the weak convergence criterion, for some d.f. $F$ with ch.f. $f$,
$F_n \xrightarrow{w} F$ up to additive constants, and $\hat f = \hat g$. Therefore,

\[ \frac{1}{u} \int_0^u f(v)\,dv = \frac{1}{u} \int_0^u g(v)\,dv, \quad u \ne 0, \]

and, letting $u \to 0$, we obtain $f(0) = g(0)$ on account of continuity of
$f$ and of $g$ at the origin. Thus,

\[ \mathrm{Var}\,F_n = f_n(0) \to g(0) = f(0) = \mathrm{Var}\,F, \]

and the proof is completed by taking into account the direct assertion.
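The continuity theorem can be illustrated on a classical case. The sketch below assumes binomial laws with parameters $n$ and $\lambda/n$, whose ch.f.'s $(1 + \frac{\lambda}{n}(e^{iu} - 1))^n$ converge to the Poisson ch.f. $\exp\{\lambda(e^{iu} - 1)\}$, a function continuous at $u = 0$; the value $\lambda = 2$ and the grid are arbitrary choices of this example.

```python
import cmath
import math

def binomial_chf(n, p, u):
    # Ch.f. of the binomial law with parameters n and p.
    return (1.0 + p * (cmath.exp(1j * u) - 1.0)) ** n

def poisson_chf(lam, u):
    # Ch.f. of the Poisson law with parameter lam.
    return cmath.exp(lam * (cmath.exp(1j * u) - 1.0))

def max_gap(n, lam=2.0, grid=200):
    # Both ch.f.'s have period 2*pi, so a grid on [-pi, pi] covers all of R.
    us = [-math.pi + 2.0 * math.pi * j / grid for j in range(grid + 1)]
    return max(abs(binomial_chf(n, lam / n, u) - poisson_chf(lam, u)) for u in us)

gaps = [max_gap(n) for n in (10, 100, 1000, 10000)]
print(gaps)
```

The gaps shrink roughly like $1/n$, which is consistent with the complete convergence of the binomial d.f.'s to the Poisson d.f.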

C. Uniform convergence theorem. If a sequence $f_n$ of ch.f.'s converges
to a ch.f. $f$, then the convergence is uniform on every finite interval
$[-U, +U]$.

Proof. On account of B, $F_n \xrightarrow{c} F$ up to additive constants.

Let $\varepsilon > 0$ and $U > 0$ be arbitrarily fixed. We have

\[ |f_n(u) - f(u)| \le \Big| \int_a^b e^{iux}\,dF_n(x) - \int_a^b e^{iux}\,dF(x) \Big| + \mathrm{Var}\,F_n - F_n[a, b) + \mathrm{Var}\,F - F[a, b) \]

where we take $a$, $b$ to be continuity points of $F$. Let $|a|$, $|b|$ and then
$n$ be so large that

\[ \mathrm{Var}\,F - F[a, b) < \frac{\varepsilon}{6}, \qquad \mathrm{Var}\,F_n - F_n[a, b) < \mathrm{Var}\,F - F[a, b) + \frac{\varepsilon}{6} < \frac{\varepsilon}{3}. \]

It suffices to show that, for $n$ sufficiently large and all $u \in [-U, +U]$,

\[ \Delta_n = \Big| \int_a^b e^{iux}\,dF_n(x) - \int_a^b e^{iux}\,dF(x) \Big| < \frac{\varepsilon}{2}. \]

Let

\[ a = x_1 < x_2 < \cdots < x_{N+1} = b \]

where the subdivision points are continuity points of $F$ and
$\delta = \max_{k \le N} (x_{k+1} - x_k) < \varepsilon/8U$. Since, by the mean value theorem,

\[ |e^{iux} - e^{iux'}| \le |x - x'|\,U \quad \text{for } |u| \le U, \]

it follows that, upon replacing $x$ by $x_k$ in every interval $[x_k, x_{k+1})$, $\Delta_n$
is modified by at most

\[ \delta U \int dF_n(x) + \delta U \int dF(x) \le 2\delta U < \frac{\varepsilon}{4}. \]

Thus, it remains to show that, for $n$ sufficiently large,

\[ \sum_{k=1}^{N} \big| F_n[x_k, x_{k+1}) - F[x_k, x_{k+1}) \big| < \frac{\varepsilon}{4}. \]

Since $F_n[x_k, x_{k+1}) \to F[x_k, x_{k+1})$ for every $k \le N$, the last assertion
follows and the proof is complete.

Remark. In fact, we proved, with a supplementary detail, the first 
assertion of the complete convergence criterion without using the Helly- 
Bray theorem. 

Corollary 1. If $f_n \to f$ and $u_n \to u$ finite, then $f_n(u_n) \to f(u)$.
This follows, by C and continuity of $f$, from

\[ |f_n(u_n) - f(u)| \le |f_n(u_n) - f(u_n)| + |f(u_n) - f(u)|. \]

Corollary 2. A set $\{F_t\}$ of d.f.'s is completely compact (up to additive
constants) if, and only if, the corresponding set $\{f_t\}$ of ch.f.'s is
equicontinuous at $u = 0$.

Proof. By 13.4B equicontinuity of $\{f_t\}$ at $u = 0$ is equivalent to
equicontinuity on $R$.

On the other hand, Ascoli's theorem and its converse say that a set
of continuous functions is compact in the sense of uniform convergence
on a finite closed interval if, and only if, it is uniformly bounded and
equicontinuous on this interval. Since the $f_t$ are uniformly bounded,
the assertion follows by B and C.

Remark. If the d.f.'s $F_n$, $F$ of r.v.'s are differentiable and $F_n' \to F'$
on $R$, then $f_n \to f$ uniformly on $R$. It suffices to use 17 in Complements
and Details of Ch. II.


13.3 Composition of d.f.'s and multiplication of ch.f.'s. A function
$F$ on $R = (-\infty, +\infty)$ is said to be composed of d.f.'s $F_1$ and $F_2$, and
written $F_1 * F_2$, if

\[ F(x) = \int F_1(x - y)\,dF_2(y), \quad x \in R \]

where we assume, for simplicity, that $F_1(-\infty) = F_2(-\infty) = 0$; otherwise,
to avoid trivial complications, we would have to replace $F_1$ by
$F_1 - F_1(-\infty)$.

Since, for every fixed $y$, $F_1(x - y)$ are values of a d.f., nondecreasing,
continuous from the left and bounded by $F_1(-\infty) = 0$ and $F_1(+\infty) \le 1$,
it follows, upon applying the dominated convergence theorem, that $F$
has the same properties and that $\mathrm{Var}\,F = \mathrm{Var}\,F_1 \cdot \mathrm{Var}\,F_2$.

A. Composition theorem. If $F = F_1 * F_2$, then $f = f_1 f_2$, and conversely.

Proof. Let $F = F_1 * F_2$ and let $a = x_{n1} < \cdots < x_{n,k_n+1} = b$ with
$\sup_k (x_{n,k+1} - x_{nk}) \to 0$ as $n \to \infty$. Since, for every $u \in R$,

\[ \int_a^b e^{iux}\,dF(x) = \lim_n \sum_k e^{iux_{nk}} F[x_{nk}, x_{n,k+1}) \]
\[ = \lim_n \int \Big\{ \sum_k e^{iu(x_{nk} - y)} F_1[x_{nk} - y, x_{n,k+1} - y) \Big\} e^{iuy}\,dF_2(y), \]

we obtain, by letting $a \to -\infty$ and $b \to +\infty$,

\[ \int e^{iux}\,dF(x) = \int e^{iux}\,dF_1(x) \cdot \int e^{iuy}\,dF_2(y), \]

so that $f = f_1 f_2$ and the first assertion is proved.

Conversely, according to the first assertion, $f_1 f_2$ is the ch.f. of $F_1 * F_2$
and, hence, on account of the one-to-one correspondence between $f$ and
$F + c$, $F = F_1 * F_2$ up to an additive constant. The converse is proved.

Corollary 1. A product of ch.f.'s is a ch.f. and, in particular, if $f$ is
a ch.f. so is $|f|^2$.

For $f = f_1 f_2$ is the ch.f. of the d.f. $F = F_1 * F_2$, and the particular case
follows from the fact that, if $f$ is a ch.f., so is its complex-conjugate $\bar f$
which corresponds to the d.f. $F(+\infty) - F(-x + 0)$.

Corollary 2. Composition of d.f.'s is commutative and associative.

For the corresponding multiplication of ch.f.'s has these properties.
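The composition theorem is easy to check on laws with finitely many atoms, where composing the d.f.'s amounts to convolving the masses. The following sketch makes that assumption; the two laws `pmf1` and `pmf2` and the helper names are hypothetical and serve only as an example.

```python
import cmath

def compose(pmf1, pmf2):
    # Masses of the composed law: the mass p*q is placed at x + y.
    out = {}
    for x, p in pmf1.items():
        for y, q in pmf2.items():
            out[x + y] = out.get(x + y, 0.0) + p * q
    return out

def chf(pmf, u):
    # Ch.f. of a discrete law given as {atom: mass}.
    return sum(p * cmath.exp(1j * u * x) for x, p in pmf.items())

pmf1 = {0: 0.3, 1: 0.7}
pmf2 = {-1: 0.25, 0: 0.5, 2: 0.25}
pmf = compose(pmf1, pmf2)

# Largest discrepancy between the ch.f. of the composed law and the product f1 * f2.
gap = max(abs(chf(pmf, u) - chf(pmf1, u) * chf(pmf2, u))
          for u in (-3.0, -0.4, 0.0, 1.1, 2.7))
print(gap)
```

Swapping the arguments of `compose` gives the same masses, in line with Corollary 2.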

13.4 Elementary properties of ch.f.'s and first applications. In the
sequel, the elementary properties we establish now will play an important
ancillary role, and the first applications will be used, improved,
and generalized.

We denote by $F$ and $f$, with same subscripts if any, corresponding
d.f.'s and ch.f.'s; in general, the corresponding d.f.'s $F$ are defined up to
additive constants, but if $f$ is ch.f. of a r.v., then, as usual, we take
$F(-\infty) = 0$, $F(+\infty) = 1$. We say that a r.v. $X$ is symmetric if $X$ and
$-X$ have the same d.f., that is, for every $x \in R$, $P[X < x] = P[X > -x]$.

A. General properties. Every ch.f. $f$ is uniformly continuous and

\[ |f| \le f(0) = \mathrm{Var}\,F \le 1, \qquad f(-u) = \bar f(u). \]

If $f$ is the ch.f. of a r.v. $X$, then the function with values $e^{iua} f(bu)$ is the
ch.f. of the r.v. $a + bX$. In particular, $\bar f$ is the ch.f. of $-X$ and $f$ is real
if, and only if, $X$ is symmetric.

Elementary inequality: $f(0) - \Re f(2u) \le 4(f(0) - \Re f(u))$.

Proof. The first assertion follows from $f(u) = \int e^{iux}\,dF(x)$. The second
assertion follows from $E e^{iu(a + bX)} = e^{iua} E e^{ibuX}$. Finally, if $X$ is
symmetric, then $f(u) = E e^{iuX} = E e^{-iuX} = f(-u) = \bar f(u)$ so that $f$ is
real; conversely, if $f$ is real, then changing the signs of $a$ and $b$ in the
inversion formula is equivalent to taking the complex-conjugate of the
integrand and changing its sign, so that $F[a, b) = F[-b, -a)$ and,
hence, by letting $a \to -\infty$ and $b \uparrow x$, we have
$P[X < x] = F(x) = 1 - F(-x + 0) = P[X > -x]$.

The elementary inequality obtains upon integrating $1 - \cos 2ux \le
4(1 - \cos ux)$ with respect to $F$.


B. Increments inequality: for any $h \in R$

\[ |f(u) - f(u + h)|^2 \le 2 f(0) \{f(0) - \Re f(h)\}. \]

Integral inequality: for $u > 0$ there exist functions $0 < m(u) <
M(u) < \infty$ such that

\[ m(u) \int_0^u \{f(0) - \Re f(v)\}\,dv \le \int \frac{x^2}{1 + x^2}\,dF(x) \le M(u) \int_0^u \{f(0) - \Re f(v)\}\,dv; \]

if $f(0) = 1$, then, for $u$ sufficiently close to 0,

\[ \int \frac{x^2}{1 + x^2}\,dF(x) \le -M(u) \int_0^u (\log \Re f(v))\,dv. \]

Proof. The increments inequality follows, by Schwarz's inequality,
from

\[ |f(u) - f(u + h)|^2 = \Big| \int e^{iux} (1 - e^{ihx})\,dF(x) \Big|^2 \le \int dF(x) \int |1 - e^{ihx}|^2\,dF(x) \]
\[ = 2 f(0) \int (1 - \cos hx)\,dF(x) = 2 f(0) \{f(0) - \Re f(h)\}. \]

The integral inequality follows, by the elementary inequality with
$u \ne 0$

\[ 0 < M^{-1}(u) \le |u| \Big(1 - \frac{\sin ux}{ux}\Big) \frac{1 + x^2}{x^2} \le m^{-1}(u) < \infty, \quad x \in R, \]

from

\[ \int_0^u dv \int (1 - \cos vx)\,dF(x) = u \int \Big(1 - \frac{\sin ux}{ux}\Big) \frac{1 + x^2}{x^2} \cdot \frac{x^2}{1 + x^2}\,dF(x). \]

The case $f(0) = 1$ follows then from the elementary inequality $1 - a
\le -\log a$ for $a \ge 0$.
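The increments inequality of B and the elementary inequality of A can be tested on a grid. The sketch below assumes a hypothetical three-point law; it checks $|f(u) - f(u+h)|^2 \le 2f(0)\{f(0) - \Re f(h)\}$ and $f(0) - \Re f(2u) \le 4(f(0) - \Re f(u))$, with a small tolerance for rounding.

```python
import cmath

def chf(pmf, u):
    # Ch.f. of a discrete law given as {atom: mass}.
    return sum(p * cmath.exp(1j * u * x) for x, p in pmf.items())

pmf = {-2.0: 0.1, 0.5: 0.6, 3.0: 0.3}   # hypothetical law of a r.v.
f0 = chf(pmf, 0.0).real                  # equals Var F = 1 here
grid = [k / 10.0 for k in range(-50, 51)]
tol = 1e-12                              # allowance for floating-point rounding

increments_ok = all(
    abs(chf(pmf, u) - chf(pmf, u + h)) ** 2 <= 2.0 * f0 * (f0 - chf(pmf, h).real) + tol
    for u in grid for h in grid
)
elementary_ok = all(
    f0 - chf(pmf, 2.0 * u).real <= 4.0 * (f0 - chf(pmf, u).real) + tol
    for u in grid
)
print(increments_ok, elementary_ok)
```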

The integral inequality permits us in turn to find bounds for
$\int_{|x| < c} x^2\,dF(x)$ and $\int_{|x| \ge c} dF(x)$, $(c > 0)$, by

\[ (\mathrm{I}) \quad \frac{1}{1 + c^2} \int_{|x| < c} x^2\,dF(x) + \frac{c^2}{1 + c^2} \int_{|x| \ge c} dF(x) \le \int \frac{x^2}{1 + x^2}\,dF(x) \le \int_{|x| < c} x^2\,dF(x) + \int_{|x| \ge c} dF(x). \]

However, it is sometimes more convenient to use the direct

B'. Truncation inequality: for $u > 0$:

\[ \int_{|x| < 1/u} x^2\,dF(x) \le \frac{3}{u^2} \{f(0) - \Re f(u)\}, \]
\[ \int_{|x| \ge 1/u} dF(x) \le \frac{7}{u} \int_0^u \{f(0) - \Re f(v)\}\,dv. \]

If $f(0) = 1$ and $u$ is sufficiently close to 0, then we can replace $1 - \Re f$ in
the foregoing by $-\log \Re f$.
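The truncation inequality can be tried on a discrete law, for which $\int_0^u (1 - \cos vx)\,dv = u - \sin(ux)/x$ is available in closed form. The sketch below assumes the constants 3 and 7 (they are at least as large as the values $24/11$ and $1/(1 - \sin 1)$ given by the derivation) and a hypothetical five-point law with $f(0) = 1$.

```python
import math

pmf = {-4.0: 0.05, -0.3: 0.4, 0.2: 0.35, 1.5: 0.15, 6.0: 0.05}  # hypothetical law

def real_chf(u):
    # Real part of the ch.f. of the discrete law above.
    return sum(p * math.cos(u * x) for x, p in pmf.items())

def integral_one_minus_real(u):
    # int_0^u (1 - Re f(v)) dv, using int_0^u (1 - cos(v x)) dv = u - sin(u x)/x.
    return sum(p * (u - (math.sin(u * x) / x if x != 0 else u)) for x, p in pmf.items())

def check(u, tol=1e-12):
    small = sum(p * x * x for x, p in pmf.items() if abs(x) < 1.0 / u)
    tail = sum(p for x, p in pmf.items() if abs(x) >= 1.0 / u)
    first = small <= 3.0 / u ** 2 * (1.0 - real_chf(u)) + tol
    second = tail <= 7.0 / u * integral_one_minus_real(u) + tol
    return first and second

all_ok = all(check(u) for u in (0.1, 0.25, 0.5, 1.0, 2.0, 5.0))
print(all_ok)
```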

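These inequalities follow, respectively, from

\[ \int (1 - \cos ux)\,dF(x) \ge \int_{|x| < 1/u} \frac{u^2 x^2}{2} \Big(1 - \frac{u^2 x^2}{12}\Big)\,dF(x) \ge \frac{11 u^2}{24} \int_{|x| < 1/u} x^2\,dF(x) \]

and from

\[ \frac{1}{u} \int_0^u dv \int (1 - \cos vx)\,dF(x) = \int \Big(1 - \frac{\sin ux}{ux}\Big)\,dF(x) \ge (1 - \sin 1) \int_{|x| \ge 1/u} dF(x). \]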

The case $f(0) = 1$ follows as in B.

Applications. 1° If $f_n \to g$ continuous at $u = 0$, then $g$ is continuous
on $R$.

This follows from the fact that the increments inequality with $f_n$
becomes, as $n \to \infty$, the same inequality with $g$.

2° If the sequence $f_n$ is equicontinuous at $u = 0$, then it is
equicontinuous at every $u \in R$.

For, then, as $h \to 0$,

\[ |f_n(u) - f_n(u + h)|^2 \le 2\{f_n(0) - \Re f_n(h)\} \to 0 \]

uniformly in $n$.

3° If $f_n \to 1$ on $(-U, +U)$, then $f_n \to 1$ on $R$.

This follows by induction as $f_n(2u) \to 1$ for $|u| < U$ follows from

\[ |f_n(u) - f_n(2u)|^2 \le 2\{f_n(0) - \Re f_n(u)\} \to 0 \quad \text{for } |u| < U. \]

If we take into account the fact that the set of all differences of numbers
belonging to a set of positive Lebesgue measure contains a nondegenerate
interval $(-U, +U)$, this proposition can be improved as
follows:

If $f_n \to 1$ on a set $A$ of positive Lebesgue measure, then $f_n \to 1$ on $R$.

For, we can assume that the set $A$ is symmetric with respect to the
origin and contains it, since, for $u \in A$,

\[ f_n(-u) = \bar f_n(u) \to 1, \qquad 1 \ge f_n(0) \ge |f_n(u)| \to 1, \]

and, then, $f_n(u - u') \to 1$ for $u, u' \in A$ on account of

\[ |f_n(u) - f_n(u - u')|^2 \le 2\{f_n(0) - \Re f_n(-u')\} \to 0. \]

4° We shall now prove an elegant proposition (slightly completed)
due to Kawata and Ugakawa. We use repeatedly Corollary 2 of the
weak convergence criterion which says that, if a sequence of ch.f.'s
$g_n \to g$ a.e., then the corresponding sequence of d.f.'s $G_n \xrightarrow{w} G$ up to
additive constants and the ch.f. of $G$ coincides a.e. with $g$.

Let $g_n = \prod_{k=1}^{n} f_k \to g$ a.e. Either $g = 0$ a.e., and then $G_n \xrightarrow{w} 0$ up to
additive constants. Or $g \ne 0$ on a set $A$ of positive Lebesgue measure,
and then $G_n \xrightarrow{c} G$ up to additive constants.

Proof. In both cases $G_n \xrightarrow{w} G$ up to additive constants. The first
case follows from the recalled proposition. In the second case, we have
to prove that $\mathrm{Var}\,G_n \to \mathrm{Var}\,G$. Since $\mathrm{Var}\,F^s = \mathrm{Var}\,(F * \bar F) = (\mathrm{Var}\,F)^2$
and $f^s = |f|^2$ (with $\bar F$ the d.f. corresponding to $\bar f$), it suffices to
consider real-valued nonnegative ch.f.'s.
But then $\lim_{m} \prod_{k=n+1}^{m} f_k$ exists on $R$ and coincides a.e. with a ch.f., while,
for $m$, $n$ sufficiently large, $g_m g_n \ne 0$ a.e. on $A$, and, as $m \to \infty$ and then
$n \to \infty$,

\[ \prod_{k=n+1}^{m} f_k = g_m / g_n \to g / g_n = \prod_{k=n+1}^{\infty} f_k \to 1 \quad \text{a.e. on } A. \]

It follows, by 3°, that $\prod_{k=n+1}^{\infty} f_k \to 1$ a.e. on $R$. Therefore, if $H_n$ is the
d.f. whose ch.f. coincides a.e. with $\prod_{k=n+1}^{\infty} f_k$, then $\mathrm{Var}\,H_n \to 1$. But, by
11.2a and the composition theorem 13.3A,

\[ \liminf \mathrm{Var}\,G_n \ge \mathrm{Var}\,G = \mathrm{Var}\,G_n \cdot \mathrm{Var}\,H_n. \]

It follows, by letting $n \to \infty$, that $\mathrm{Var}\,G_n \to \mathrm{Var}\,G$. The proof is
completed.

5° Let $F_{nk}$ be d.f.'s of r.v.'s, $k = 1, \ldots, k_n \to \infty$, $\gamma_n = \sum_k (1 - f_{nk})$.

Set $\Psi_n(x) = \sum_k \int_{-\infty}^{x} \dfrac{y^2}{1 + y^2}\,dF_{nk}(y)$ and

\[ \alpha(c) = \sup_n \sum_k \int_{|x| \ge c} dF_{nk}(x), \qquad \beta(c) = \sup_n \sum_k \int_{|x| < c} x^2\,dF_{nk}(x), \quad c > 0 \text{ finite}. \]

If $f_n = \prod_k f_{nk}$ with $f_{nk}$ real-valued, then the following properties are
equivalent:

($C_1$) the sequence $F_n$ is completely compact.

($C_2$) the sequence $\gamma_n$ is equicontinuous at $u = 0$.

($C_3$) $\alpha(c) \to 0$ as $c \to \infty$ and $\alpha(c) + \beta(c) < \infty$ for every (some) $c$.

($C_4$) the sequence $\Psi_n$ is bounded and completely compact.

Proof. ($C_1$) $\Leftrightarrow$ ($C_2$) by 13.2 C Cor. 2 and the inequality
$1 - \sum a_k \le \prod (1 - a_k) \le \exp\{-\sum a_k\}$, $0 \le a_k \le 1$. ($C_2$) $\Rightarrow$ ($C_3$) by B',
and ($C_3$) $\Rightarrow$ ($C_2$) by $\gamma_n(u) \le 2\alpha(c) + \beta(c) u^2/2$. Finally, ($C_3$) $\Leftrightarrow$ ($C_4$)
and "some" $\Leftrightarrow$ "every" by (I),
$\alpha(c) c^2/(1 + c^2) \le \sup_n \int_{|x| \ge c} d\Psi_n(x) \le \alpha(c)$ and
11.2B.

Let

\[ m^{(k)} = \int x^k\,dF(x), \qquad \mu^{(r)} = \int |x|^r\,dF(x), \quad k = 0, 1, 2, \ldots, \ r \ge 0, \]

be, respectively, the $k$th moments and the $r$th absolute moments of $F$.
Let $f^{(k)}$ be the $k$th derivative of $f$ ($f^{(0)} = f$) and, as usual, let $\theta$, $\theta'$ be
quantities with modulus bounded by 1.

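C. Differentiability properties. If $f^{(2n)}(0)$ exists and is finite,
then $\mu^{(r)} < \infty$ for $r \le 2n$.

If $\mu^{(n+\delta)} < \infty$ for a $\delta \ge 0$, then for every $k \le n$

\[ f^{(k)}(u) = i^k \int e^{iux} x^k\,dF(x), \quad u \in R \]

and $f^{(k)}$ is continuous and bounded by $\mu^{(k)}$; moreover

\[ f(u) = \sum_{k=0}^{n-1} m^{(k)} \frac{(iu)^k}{k!} + \rho_n(u), \quad u \in R \]

where

\[ \rho_n(u) = u^n \int_0^1 \frac{(1 - t)^{n-1}}{(n - 1)!} f^{(n)}(tu)\,dt = m^{(n)} \frac{(iu)^n}{n!} + o(u^n) = \theta \mu^{(n)} \frac{|u|^n}{n!}, \]

and if $0 < \delta \le 1$, then

\[ \rho_n(u) = m^{(n)} \frac{(iu)^n}{n!} + 2^{1-\delta} \theta' \mu^{(n+\delta)} \frac{|u|^{n+\delta}}{(1 + \delta)(2 + \delta) \cdots (n + \delta)}. \]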

Proof. To begin with, we observe that, since $|x|^{r'} \le 1 + |x|^r$ for
$r' < r$, finiteness of $\mu^{(r)}$ implies that of $\mu^{(r')}$.

The first assertion follows from the existence and finiteness of the
$2n$th symmetric derivative by using the Fatou-Lebesgue theorem in

\[ |f^{(2n)}(0)| = \lim_{h \to 0} \int \Big(\frac{\sin hx}{h}\Big)^{2n} dF(x) \ge \int x^{2n}\,dF(x). \]

The second assertion follows from the fact that, by differentiating
$\int e^{iux}\,dF(x)$ $k$ times under the integral sign, the integral so obtained is
absolutely convergent and, hence, this differentiation and the integration
can be interchanged.

The limited expansions follow by integrating the limited expansions
of $e^{iux}$ with corresponding forms of its remainder term. The last and
less usual corresponding form of its remainder is obtained upon observing
that $|e^{ia} - 1| \le 2 |a/2|^{\delta}$ (since, for $0 < \delta \le 1$, if $|a/2| < 1$, then
$|e^{ia} - 1| \le |a| \le 2 |a/2|^{\delta}$ and, if $|a/2| \ge 1$, then $|e^{ia} - 1| \le 2 \le
2 |a/2|^{\delta}$), and using successive integrations by parts in

\[ \Big| \int_0^1 \frac{(1 - t)^{n-1}}{(n - 1)!} (e^{itux} - 1)\,dt \Big| \le 2^{1-\delta} |ux|^{\delta} \int_0^1 \frac{(1 - t)^{n-1}}{(n - 1)!}\, t^{\delta}\,dt = \frac{2^{1-\delta} |ux|^{\delta}}{(1 + \delta)(2 + \delta) \cdots (n + \delta)}. \]

Corollary. If all moments of $F$ exist and are finite, then $f^{(k)}(0) =
i^k m^{(k)}$ for every $k$, and

\[ f(u) = \sum_{n=0}^{\infty} m^{(n)} \frac{(iu)^n}{n!} \]

in the interval of convergence of the series.

Applications. We consider d.f.'s $F$ and ch.f.'s $f$ of r.v.'s $X$, with the
same subscripts if any. If $m^{(1)} = EX = 0$, we write $\sigma^2$ instead of
$m^{(2)} = EX^2$.

1° Normal distribution. A "reduced normal" d.f. is defined by
$F'(x) = e^{-x^2/2}/\sqrt{2\pi}$. It is the d.f. of a r.v., since $m^{(0)} = 1$ by

\[ (m^{(0)})^2 = \frac{1}{2\pi} \iint e^{-(x^2 + y^2)/2}\,dx\,dy = \frac{1}{2\pi} \int_0^{2\pi} d\theta \int_0^{\infty} e^{-\rho^2/2} \rho\,d\rho = 1. \]

Since $F'(-x) = F'(x)$, it follows at once that the odd moments vanish,
while, by integration by parts, we obtain

\[ m^{(2n)} = (2n - 1)\, m^{(2n-2)} = \cdots = \frac{(2n)!}{2^n n!}. \]

Therefore, by the foregoing corollary, the "reduced normal" ch.f. is

\[ f(u) = \sum_{n=0}^{\infty} \frac{(-u^2/2)^n}{n!} = e^{-u^2/2}, \quad u \in R. \]
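The moment computation for the reduced normal law can be reproduced directly. The sketch below evaluates $m^{(2n)} = (2n)!/2^n n!$ with exact integers and sums the series of the corollary over even orders (the odd moments vanish); the truncation at 30 terms and the test point $u = 1.7$ are assumptions of the example.

```python
import math

def normal_even_moment(n):
    # m^(2n) = (2n)! / (2^n n!), an integer equal to 1 * 3 * 5 * ... * (2n - 1).
    return math.factorial(2 * n) // (2 ** n * math.factorial(n))

def chf_from_moments(u, terms=30):
    # Partial sum of m^(2n) (iu)^(2n) / (2n)!, where (iu)^(2n) = (-1)^n u^(2n).
    return sum(
        normal_even_moment(n) * (-1) ** n * u ** (2 * n) / math.factorial(2 * n)
        for n in range(terms)
    )

u = 1.7
gap = abs(chf_from_moments(u) - math.exp(-u * u / 2.0))
print(normal_even_moment(2), normal_even_moment(3), gap)
```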


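2° Bounded Liapounov theorem. Let $|X_n| \le c < \infty$ and
$EX_n = 0$.

If $s_n^2 = \sum_{k=1}^{n} \sigma_k^2 \to \infty$, then $\prod_{k=1}^{n} f_k(u/s_n) \to e^{-u^2/2}$ for every $u \in R$.

Since $E|X_n|^3 \le cEX_n^2$ and $\sigma_n^2 = EX_n^2 \le c^2$, it follows, upon fixing
$u$ arbitrarily, that

\[ f_k\Big(\frac{u}{s_n}\Big) = 1 - \frac{\sigma_k^2}{2 s_n^2} u^2 + \theta_{nk} \frac{c\,\sigma_k^2}{6 s_n^3} |u|^3 = 1 - \frac{\sigma_k^2}{2 s_n^2} u^2 (1 + o(1)) \to 1 \]

uniformly in $k \le n$. Therefore, for $n$ sufficiently large,

\[ \sum_{k=1}^{n} \log f_k\Big(\frac{u}{s_n}\Big) = -\frac{u^2}{2} (1 + o(1)) + \theta_n \frac{c^2 u^4}{4 s_n^2} (1 + o(1)) \to -\frac{u^2}{2} \]

and the assertion is proved.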
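A simple instance shows the convergence numerically. Assume, for illustration, that each $X_k$ takes the values $+1$ and $-1$ with pr. $1/2$, so that $f_k(u) = \cos u$, $\sigma_k^2 = 1$ and $s_n^2 = n$; the product of the theorem is then $\cos^n(u/\sqrt{n})$. The function name below is hypothetical.

```python
import math

def normed_sum_chf(n, u):
    # Product of the n ch.f.'s f_k(u / s_n) with f_k(u) = cos u and s_n = sqrt(n).
    return math.cos(u / math.sqrt(n)) ** n

u = 1.5
limit = math.exp(-u * u / 2.0)   # reduced normal ch.f. at u
gaps = [abs(normed_sum_chf(n, u) - limit) for n in (10, 100, 1000, 10000)]
print(gaps)
```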


§ 14. PROBABILITY LAWS AND TYPES OF LAWS 

14.1 Laws and types; the degenerate type. Since there is a one-to-
one correspondence between distributions, d.f.'s defined up to an additive
constant, and ch.f.'s, they are different but equivalent "representations"
of the same mathematical concept which we shall call pr. law or, simply,
law. Moreover, to a given distribution on the Borel field $\mathcal{B}$ we can
always make correspond the finite part of a measurable function $X$ on
some pr. space $(\Omega, \mathcal{A}, P)$, and the restriction of $P$ to $X^{-1}(\mathcal{B})$ with
values $P[X \in S]$, $S \in \mathcal{B}$, is still another representation of the law defined
by the given distribution; there are many such measurable functions
and many such spaces. Nevertheless, the various representations of a
given law have their own intuitive value. Thus, for every law we have
a multiplicity of representations and we shall use them according to
convenience.

A law will be denoted by the symbol $\mathcal{L}$, with the same affixes if any
as the d.f. or the ch.f. which represents this law, and the terminology
and notations for operations on laws will be those introduced for d.f.'s;
in particular, if $F_n \xrightarrow{w} F$ we write $\mathcal{L}_n \xrightarrow{w} \mathcal{L}$, and if $F_n \xrightarrow{c} F$ we write
$\mathcal{L}_n \xrightarrow{c} \mathcal{L}$. The case of laws of r.v.'s (with d.f.'s of variation 1) is by far
the most important. The law of a r.v. $X$ will be denoted by $\mathcal{L}(X)$, and
if a sequence $\mathcal{L}(X_n)$ of laws of r.v.'s converges completely (necessarily
to the law $\mathcal{L}(X)$ of a r.v. $X$) we shall drop "complete" and write
$\mathcal{L}(X_n) \to \mathcal{L}(X)$. From now on a law will be law of a r.v., unless otherwise
stated.

The origin and the scale of values of measured quantities, say a r.v.
$X$, are more or less arbitrarily chosen. By modifying them we modify
linearly the results of measurements, that is, we replace $X$ by $a + bX$
where $a$ and $b > 0$ are finite numbers. If, moreover, the orientation of
values can be modified, then the only restriction on the finite numbers $a$
and $b$ is that $b \ne 0$. This leads us to assign to a law $\mathcal{L}(X)$ the family
$\mathcal{T}(X) = \{\mathcal{L}(a + bX)\}$ of all laws obtainable by changes of origin, scale,
and orientation, to be called a type of laws. If $b$ is restricted either to
positive or to negative values, the corresponding families of laws will
be called positive, resp. negative types of laws.

Letting $b \to 0$ we encounter a boundary case: the simplest and at
the same time the everywhere pervading degenerate type $\{\mathcal{L}(a)\}$ of laws
of r.v.'s which degenerate at some arbitrary but finite value $a$, that is,
such that $X = a$ a.s. The corresponding family of "degenerate" d.f.'s
is that of d.f.'s with one, and only one, point of increase $a \in R$ with
$F(a + 0) - F(a - 0) = 1$. The corresponding family of "degenerate"
ch.f.'s is that of all ch.f.'s of the form $f(u) = e^{iua}$, $u \in R$, so that their
moduli reduce to 1. The converse is also true and, more precisely,


a. A ch.f. is degenerate if, and only if, its modulus equals 1 for two
values $h \ne 0$ and $\alpha h \ne 0$ of the argument whose ratio $\alpha$ is irrational. In
particular, a ch.f. $f$ is degenerate if $|f(u)| = 1$ in a nondegenerate interval.

Proof. Since $|f(h)| = 1$, there is a finite number $a$ such that $f(h) =
e^{iha}$ and, hence,

\[ e^{-iha} f(h) = \int e^{ih(x - a)}\,dF(x) = 1. \]

Thus

\[ \int (1 - \cos h(x - a))\,dF(x) = 0 \]

and, since the integrand is nonnegative, it follows that, for points $x$ of
increase of $F$, $\cos h(x - a) = 1$ so that $x' - x''$ is a multiple of $\dfrac{2\pi}{h}$ when
the points of increase $x'$, $x''$ are distinct. Replacing $h$ by $\alpha h$, we find
that $x' - x''$ is also a multiple of $\dfrac{2\pi}{\alpha h}$, which is impossible when $\alpha$ is
irrational unless there is only one point of increase. The particular case
follows.
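The dichotomy in proposition a can be seen on a two-point law. The sketch below assumes a hypothetical law with points of increase 1 and 3, so that their difference is a multiple of $2\pi/h$ for $h = \pi$; the modulus of its ch.f. equals 1 at $h$ but drops below 1 at $\alpha h$ with $\alpha = \sqrt{2}$ (irrational up to floating point), whereas a degenerate law has modulus 1 everywhere.

```python
import cmath
import math

def chf(pmf, u):
    # Ch.f. of a discrete law given as {atom: mass}.
    return sum(p * cmath.exp(1j * u * x) for x, p in pmf.items())

two_point = {1.0: 0.5, 3.0: 0.5}   # hypothetical nondegenerate lattice law
degenerate = {2.0: 1.0}            # law degenerate at a = 2

h = math.pi
alpha = math.sqrt(2.0)

mod_h = abs(chf(two_point, h))                # 1 up to rounding
mod_alpha_h = abs(chf(two_point, alpha * h))  # strictly below 1
degenerate_mods = [abs(chf(degenerate, u)) for u in (h, alpha * h, 0.37)]
print(mod_h, mod_alpha_h, degenerate_mods)
```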

Remark. The foregoing argument proves that, if $|f(h)| = 1$ for an
$h \ne 0$, then $f(u) = \sum_{k=-\infty}^{+\infty} p_k e^{iux_k}$, $u \in R$, where $p_k \ge 0$, $\sum p_k = 1$ and
$x_k = a + k \cdot \dfrac{2\pi}{h}$; the converse is immediate.

14.2 Convergence of types. If $\mathcal{L}(X_n) \to \mathcal{L}(X)$, then, for every $a$,
$b \ne 0$, $\mathcal{L}(a + bX_n) \to \mathcal{L}(a + bX)$, since $f_n \to f$ implies that $e^{iua} f_n(bu) \to
e^{iua} f(bu)$, $u \in R$. Thus, we may say that convergence of sequences of
laws to a law is, in fact, convergence of sequences of types to a type.
It may even happen that, given a sequence $\mathcal{L}(X_n)$ convergent or not, we
can proceed to changes of origin and of scale varying with $n$ and giving
rise to a convergent sequence $\mathcal{L}(a_n + b_n X_n)$. In the particular case
of consecutive sums $X_n$ of "independent" r.v.'s, a special form of the
problem of finding the sequences of laws which converge for given
changes of origin and of scale is the oldest and, until recently, was the
only limit problem of pr. theory; we shall investigate it in Part III.
Meanwhile there is an immediate question to answer: given a sequence
$\mathcal{L}(X_n)$ of laws, do all the limit laws of convergent sequences of the form
$\mathcal{L}(a_n + b_n X_n)$ belong to a same type? The answer, due to Khintchine
for positive types, is as follows:

A. Convergence of types theorem. If $\mathcal{L}(X_n) \to \mathcal{L}(X)$ nondegenerate
and $\mathcal{L}(a_n + b_n X_n) \to \mathcal{L}(X')$ nondegenerate, then the laws $\mathcal{L}(X)$ and
$\mathcal{L}(X')$ belong to the same type. More precisely, $\mathcal{L}(X') = \mathcal{L}(a + bX)$ with
$|b_n| \to |b|$, and if $b_n > 0$ then $b_n \to b$, $a_n \to a$.

However, for every finite $a$ and for every sequence $\mathcal{L}(X_n)$ of laws, there
exist numbers $a_n$ and $b_n \ne 0$ such that $\mathcal{L}(a_n + b_n X_n) \to \mathcal{L}(a)$.

In other words, given a sequence of laws, the changes of origin, scale,
and orientation can yield in the limit no more than one nondegenerate
type and can always yield in the limit the degenerate type. This shows
once more that the degenerate type is to be considered as the "degenerate
part" of every type.

Proof. The second assertion is immediate. For, by taking the numbers
$c_n$ sufficiently large so as to have $P[|X_n| \ge c_n] < \dfrac{1}{n} \to 0$, we obtain

\[ P\Big[ \Big| \frac{X_n}{n c_n} \Big| \ge \frac{1}{n} \Big] < \frac{1}{n} \to 0 \]

and, it follows at once, that $\mathcal{L}\Big(\dfrac{X_n}{n c_n}\Big) \to \mathcal{L}(0)$, so that
$\mathcal{L}\Big(a + \dfrac{X_n}{n c_n}\Big) \to \mathcal{L}(a)$.

The first assertion means that $f_n \to f$ nondegenerate and $e^{iua_n} f_n(b_n u)
\to f'(u)$ nondegenerate, $u \in R$, imply existence of two finite numbers $a$
and $b \ne 0$ such that $f'(u) = e^{iua} f(bu)$, $u \in R$. We can always select
from the sequence $b_n$ a convergent subsequence $b_{n'}$, but its limit $b$ may
be 0 or $\pm\infty$. If $b = 0$, then, since the convergence of ch.f.'s to a ch.f.
is uniform in every finite interval, we have, for every fixed $u \in R$,

\[ |f'(u)| = \lim_{n'} |f_{n'}(b_{n'} u)| = |f(0)| = 1, \]

so that, by 14.1a, $f'$ is degenerate and this contradicts the assumption.
Similarly, if $b_{n'} \to \pm\infty$, then, replacing $u$ by $\dfrac{u}{b_{n'}}$, it follows that

\[ |f(u)| = \lim_{n'} |f_{n'}(u)| = |f'(0)| = 1, \]

so that $f$ is degenerate and this contradicts the assumption. Thus
$b_{n'} \to b$ finite and different from 0. On the other hand, for all $u$
sufficiently close to 0, the continuous functions $f(bu)$ and $f'(u)$ (with values
1 for $u = 0$) differ from 0; and we have, for $n'$ sufficiently large,

\[ e^{iua_{n'}} = \frac{e^{iua_{n'}} f_{n'}(b_{n'} u)}{f_{n'}(b_{n'} u)} \to \frac{f'(u)}{f(bu)} \ne 0, \quad n' \to \infty \]

so that $\lim e^{iua_{n'}}$ exists and is finite for $|u| \le$ some $u_0 > 0$. But then
$\limsup |a_{n'}| < \infty$. Therefore, for any convergent subsequences of
$(a_{n'})$, $a'_n \to a'$ and $a''_n \to a''$, we have $e^{iu(a'_n - a''_n)} \to e^{iu(a' - a'')} = 1$ for
$|u| \le u_0$. It follows, by 13.4 Application 3°, that $a' - a'' = 0$ hence
$a_{n'} \to$ some $a \in R$ and $f'(u) = e^{iua} f(bu)$, $u \in R$.

Clearly, it remains only to prove that $|b_n| \to |b|$. Let $b_{n'} \to b$
and $b_{n''} \to b'$ hence $a_{n'} \to a$ and $a_{n''} \to a'$; it suffices to prove that
if, for every $u$, $e^{iua} f(bu) = e^{iua'} f(b'u)$, then $|b| = |b'|$. Upon replacing
$b'u$ by $u$ and $\dfrac{b}{b'}$ by $c$, it suffices to prove that, if $|c| \le 1$ and, for
every $u$, $|f(u)|^2 = |f(cu)|^2$, then $|c| = 1$. But $|c| < 1$ entails, upon
replacing repeatedly $u$ by $cu$,

\[ |f(u)|^2 = |f(cu)|^2 = \cdots = \lim_n |f(c^n u)|^2 = 1. \]

Thus, the nondegeneracy assumption excludes the possibility $|c| < 1$,
so that $|c| = 1$ and the proof is complete.

Remark. It is immediately seen that if we limit ourselves to, say,
positive types only, then, under the foregoing assumptions, $a_n \to a$
and $b_n \to b$. We leave to the reader to find conditions under which
this property remains valid for types.

Corollary. If, for every $u$,

$$e^{iua_n} f_n(b_n u) \to f(u) \quad \text{and} \quad e^{iua'_n} f_n(b'_n u) \to f(u)$$

where $f$ is a nondegenerate ch.f. and $b_n b'_n > 0$ for every $n$, then


一 a n 


0 and 


Replace in the theorem $X_n$ by $a'_n + b'_n X_n$.

14.3 Extensions. The results and terminology of this chapter extend
at once to families of r.v.'s, and we shall content ourselves with a
few generalities.

The law of a random vector $X = \{X_1, \ldots, X_N\}$ with d.f. $F_X$ on $R^N$
is represented by the ch.f. $f_X$ on $R^N$ defined by the $N$-uple integral

$$f_X(u) = \int e^{iux}\,dF_X(x), \quad ux = u_1x_1 + \cdots + u_Nx_N,$$

or, explicitly, by

$$f_X(u_1, \ldots, u_N) = \int \cdots \int e^{i(u_1x_1 + \cdots + u_Nx_N)}\,d_{x_1} \cdots d_{x_N} F_X(x_1, \ldots, x_N).$$

The integral which appears in the inversion formula becomes an $N$-uple
$\int_{-U_1}^{+U_1} \cdots \int_{-U_N}^{+U_N}$ and the “kernel” $\dfrac{e^{-iua} - e^{-iub}}{iu}$
becomes $\displaystyle\prod_{k=1}^{N} \frac{e^{-iu_ka_k} - e^{-iu_kb_k}}{iu_k}$.

We observe that there is a one-to-one correspondence between the
law of the random vector $X = \{X_1, \ldots, X_N\}$ and the laws of the r.v.'s
$uX = u_1X_1 + \cdots + u_NX_N$, where $u$ varies over $R^N$, since

$$f_X(tu) = f_{uX}(t), \quad t \in R,$$

and, in particular, $f_X(u) = f_{uX}(1)$.
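
The relation $f_X(tu) = f_{uX}(t)$ can be checked numerically on empirical ch.f.'s, for which it holds up to rounding. The following is only a minimal sketch: the sample, the direction `u`, and the helper names `ecf_vector` and `ecf_scalar` are illustrative assumptions, not part of the text.

```python
import cmath
import random


def ecf_vector(sample, u):
    """Empirical ch.f. of a random vector at the point u (a tuple)."""
    total = 0j
    for x in sample:
        total += cmath.exp(1j * sum(uk * xk for uk, xk in zip(u, x)))
    return total / len(sample)


def ecf_scalar(values, t):
    """Empirical ch.f. of a real r.v. at the point t."""
    return sum(cmath.exp(1j * t * v) for v in values) / len(values)


random.seed(0)
# Hypothetical sample of a 3-dimensional vector with dependent components.
sample = []
for _ in range(500):
    z = random.gauss(0.0, 1.0)
    sample.append((z, z + random.random(), random.expovariate(1.0) - z))

u = (0.7, -1.2, 0.4)
# Observed values of the r.v. uX = u_1 X_1 + u_2 X_2 + u_3 X_3.
projections = [sum(uk * xk for uk, xk in zip(u, x)) for x in sample]

# Largest discrepancy between f_X(tu) and f_{uX}(t) over a few values of t.
gap = max(
    abs(ecf_vector(sample, tuple(t * uk for uk in u)) - ecf_scalar(projections, t))
    for t in (-2.0, -0.5, 0.0, 1.0, 3.0)
)
print(gap)
```

The two sides are the same average of exponentials written in two ways, so the discrepancy is of the order of floating-point rounding.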

Finally, the law of a random function $X = \{X_t, t \in T\}$ is the set of
joint laws of all its finite subfamilies.


§ 15. NONNEGATIVE-DEFINITENESS; REGULARITY

15.1 Ch.f.'s and nonnegative-definiteness. The class of ch.f.'s has
been defined to be the class of Fourier-Stieltjes transforms of d.f.'s.
Conversely, given a continuous function $g$ on $R$, we can recognize
whether or not it is a ch.f. by applying the inversion formula: if the
right-hand side of the inversion formula exists and is nonnegative for
all pairs $a < b$ of finite numbers, then $g$ is a ch.f. up to a multiplicative
constant. If $g$ is absolutely integrable on $R$, then it suffices to apply
Corollary 1 of the inversion formula and verify that the function $F'$ is
nonnegative. A very important criterion of a different type is that of
nonnegative-definiteness that we investigate now.

Let $g$ be a real or complex-valued function on a set $D_S \subset R$ obtained
by forming all differences of the elements of a set $S \neq \emptyset$; for example,
$S = [0, U)$ and $D_S = (-U, +U)$, $S$ = set of all positive integers and
$D_S$ = set of all integers. Sets $D_S$ are necessarily symmetric with respect
to the origin $u = 0$ and contain it. We say that $g$ on $D_S$ is nonnegative-definite
if for every finite set $S_n \subset S$ and every real or complex-valued
function $h$ on $S_n$

$$\sum_{u,v \in S_n} g(u - v)h(u)\bar{h}(v) \ge 0;$$

we shall omit mention of $D_S$ when $D_S = R$.
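
The defining relation can be evaluated directly on a finite set. The sketch below is an illustration under assumed data: the points, the weights `h`, and the helper name `quadratic_form` are hypothetical. It uses $g = \cos$ (the ch.f. of the law with masses 1/2 at $\pm 1$), for which the form has a closed expression, and $g(u) = 1 + u^2$, for which the form is negative on $\{0, 1\}$.

```python
import cmath
import math


def quadratic_form(g, points, h):
    """Sum over u, v in points of g(u - v) h(u) conj(h(v))."""
    total = 0j
    for u, hu in zip(points, h):
        for v, hv in zip(points, h):
            total += g(u - v) * hu * hv.conjugate()
    return total


points = [0.0, 0.5, 1.3, 2.0]
h = [1 + 0j, -2j, 0.5 - 1j, -1 + 0.25j]

# cos is the ch.f. of the law putting mass 1/2 at +1 and at -1.
form_cos = quadratic_form(math.cos, points, h)

# For this g the form equals (|sum h(u)e^{iu}|^2 + |sum h(u)e^{-iu}|^2) / 2.
s_plus = sum(hu * cmath.exp(1j * u) for u, hu in zip(points, h))
s_minus = sum(hu * cmath.exp(-1j * u) for u, hu in zip(points, h))
closed_form = (abs(s_plus) ** 2 + abs(s_minus) ** 2) / 2

# g(u) = 1 + u^2 is not nonnegative-definite: g(1) exceeds g(0).
form_bad = quadratic_form(lambda u: 1 + u * u, [0.0, 1.0], [1 + 0j, -1 + 0j])

print(form_cos.real, closed_form, form_bad.real)
```

A single negative value of the form, as for the second function, is enough to rule out nonnegative-definiteness; nonnegative values on one finite set prove nothing by themselves.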

a. If $g$ on $D_S$ is nonnegative-definite, then, for every $u \in D_S$,

$$g(0) \ge 0, \quad g(-u) = \bar{g}(u), \quad |g(u)| \le g(0).$$

If, moreover, $D_S \supset (-U, +U)$ and $g$ is continuous at the origin, then $g$
is uniformly continuous on the set of limit points of $D_S$.

Proof. We apply the defining relation with

$$S_1 = \{0\}, \quad S_2 = \{0, u\}, \quad S_3 = \{0, u, u'\}.$$

With $S_1$ we obtain $g(0) \ge 0$. It follows with $S_2$ that $g(u)h(u) +
g(-u)\bar{h}(u)$ is real and hence $g(-u) = \bar{g}(u)$ (take $h(u) = 1$ and $h(u) = i$).
We use these two properties below.

The discriminant of a nonnegative quadratic form being nonnegative,
elementary computations with $S_2$ yield $|g(u)| \le g(0)$. For the last
assertion we exclude the trivial case $g(0) = 0$ which implies $g = 0$, and,
to simplify the writing, assume that $g(0) = 1$ (it suffices to replace $g$
by $g/g(0)$). The same discriminant property but with $S_3$ yields, by
elementary computations,

$$|g(u) - g(u')|^2 \le 1 - |g(u - u')|^2 - 2\Re\{\bar{g}(u)g(u')(1 - g(u - u'))\}.$$

Therefore, if $g$ is continuous at the origin, that is, if $g(u - u') \to g(0)
= 1$ as $u - u' \to 0$, then $g(u') \to g(u)$. The proof is complete.

The foregoing proposition shows that a nonnegative-definite function
$g$ on $R$ continuous at the origin has properties similar to those of
ch.f.'s. In fact, $g$ coincides on $R$ with a ch.f., up to a multiplicative
constant; and this is what we intend to prove now. According to a,
if $g(0) = 0$, then $g = 0$ so that, by excluding this trivial case and dividing
by $g(0)$ we can and will assume from now on that $g(0) = 1$.

b. Herglotz lemma. A function $g$ on the set $D_S = \{\ldots, -2c, -c,
0, +c, +2c, \ldots\}$ is nonnegative-definite if, and only if, it coincides on this
set with a ch.f.

$$g(u) = \int_{-\pi/c}^{+\pi/c} e^{iux}\,dF(x).$$

Proof. We can assume that $c > 0$. If $g$ on $D_S$ is nonnegative-definite,
then, for every integer $n$ and every finite number $x$,

$$G'_n(x) = \frac{1}{2\pi} \sum_{k=-n+1}^{n-1} \left(1 - \frac{|k|}{n}\right) g(kc)e^{-ikx}
= \frac{1}{2\pi n} \sum_{j=1}^{n} \sum_{h=1}^{n} g((j - h)c)e^{-i(j-h)x} \ge 0.$$

Upon multiplying by $e^{ikx}$ with some fixed value of $k$ and integrating
over $[-\pi, +\pi)$, we obtain

$$\left(1 - \frac{|k|}{n}\right) g(kc) = \int_{-\pi}^{+\pi} e^{ikx} G'_n(x)\,dx = \int_{-\pi/c}^{+\pi/c} e^{i(kc)x}\,dF_n(x)$$

where $F_n$ is a d.f. with $F_n(-\pi/c) = 0$, $F_n(+\pi/c) = g(0) = 1$. The
“only if” assertion follows, on account of the weak compactness and
Helly-Bray lemma, by letting $n \to \infty$ along a suitable subsequence
of integers. The “if” assertion is immediate (as below).
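
The nonnegativity of the weighted sums $G'_n(x)$ and their identity with the double sum can be checked numerically. The sketch below assumes, for illustration only, $g(u) = e^{-|u|}$ (a Cauchy ch.f.) and arbitrary values of `c` and `n`; the function names are hypothetical.

```python
import cmath
import math


def g(u):
    """Ch.f. of a Cauchy law, used here as a known nonnegative-definite function."""
    return math.exp(-abs(u))


def weighted_sum(g, c, n, x):
    """(1/2pi) * sum over |k| < n of (1 - |k|/n) g(kc) e^{-ikx}."""
    total = 0j
    for k in range(-n + 1, n):
        total += (1 - abs(k) / n) * g(k * c) * cmath.exp(-1j * k * x)
    return total / (2 * math.pi)


def double_sum(g, c, n, x):
    """(1/(2 pi n)) * sum over j, h = 1..n of g((j - h)c) e^{-i(j-h)x}."""
    total = 0j
    for j in range(1, n + 1):
        for h in range(1, n + 1):
            total += g((j - h) * c) * cmath.exp(-1j * (j - h) * x)
    return total / (2 * math.pi * n)


c, n = 0.5, 12
grid = [-math.pi + 2 * math.pi * m / 40 for m in range(41)]
values = [weighted_sum(g, c, n, x) for x in grid]

# The two expressions agree, and the common value is real and nonnegative.
max_gap = max(abs(weighted_sum(g, c, n, x) - double_sum(g, c, n, x)) for x in grid)
min_value = min(v.real for v in values)
max_imag = max(abs(v.imag) for v in values)
print(max_gap, min_value, max_imag)
```

The double sum is the defining quadratic form with $h(jc) = e^{-ijx}$, which is why it cannot be negative when $g$ is nonnegative-definite.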

A. Bochner's theorem. A function $g$ on $R$ is nonnegative-definite
and continuous if, and only if, it is a ch.f.

Proof. The “if” assertion (Mathias) is immediate, since, if $g$ is a
ch.f. with d.f. $G$, then, letting $u$ and $v$ range over an arbitrary but finite
set in $R$,

$$\sum_{u,v} g(u - v)h(u)\bar{h}(v) = \int \Big\{\sum_{u,v} e^{i(u-v)x} h(u)\bar{h}(v)\Big\}\,dG(x)
= \int \Big|\sum_{u} e^{iux} h(u)\Big|^2\,dG(x) \ge 0.$$

Conversely, let $g$ on $R$ be nonnegative-definite and continuous. It coincides
on $R$ with a ch.f. if it does so on the set $S_r$ (dense in $R$) of all
rationals of the form $k/2^n$, $k = 0, \pm 1, \pm 2, \ldots$, $n = 1, 2, \ldots$. For
every integer $n$, let $S_n$ be the corresponding subset of all rationals of
the form $k/2^n$ so that $S_n \uparrow S_r$. Since $g$ is nonnegative-definite on $R$, it
is nonnegative-definite on every $S_n$. Therefore, by b, there exist ch.f.'s
$f_n$ such that $g(k/2^n) = f_n(k/2^n)$ whatever be $k$ and $n$. Since $S_n \uparrow S_r$,
it follows that $f_n \to g$ on $S_r$. Let $0 \le \theta, \theta_n \le 1$, so that, by b,

$$1 - \Re f_n(\theta/2^n) = \int_{-\pi}^{+\pi} (1 - \cos \theta x)\,dF_n(2^n x)
\le \int_{-\pi}^{+\pi} (1 - \cos x)\,dF_n(2^n x) = 1 - \Re g(1/2^n).$$

Therefore, by the elementary inequality $|a + b|^2 \le 2|a|^2 + 2|b|^2$ and
the increments inequality, for every fixed $h = (k_n + \theta_n)/2^n$,

$$|1 - f_n(h)|^2 \le 2|1 - f_n(k_n/2^n)|^2 + 4(1 - \Re f_n(\theta_n/2^n))
\le 2|1 - g(k_n/2^n)|^2 + 4(1 - \Re g(1/2^n)).$$

Since $g$ is continuous at the origin, it follows by 13.4, 2°, that the sequence
$f_n$ of ch.f.'s is equicontinuous. Hence, by Ascoli's theorem, it
contains a subsequence converging to a continuous function $f$, so that
$g = f$ on $S_r$ and hence on $R$. Since by the continuity theorem $f$ is a
ch.f., the proof is complete.

The “only if” assertion can be proved directly, and this direct proof
will extend to a more general case: For every $T > 0$ and $x \in R$,

$$p_T(x) = \frac{1}{T} \int_0^T\!\!\int_0^T g(u - v)e^{-i(u-v)x}\,du\,dv \ge 0,$$

since, $g$ on $R$ being nonnegative-definite and continuous, the integral
can be written as a limit of nonnegative Riemann sums. Let $u = v + t$,
integrate first with respect to $v$ and set $g_T(t) = \left(1 - \frac{|t|}{T}\right) g(t)$ or $0$
according as $|t| \le T$ or $|t| \ge T$. The above relation becomes

$$p_T(x) = \int e^{-itx} g_T(t)\,dt \ge 0.$$

Now multiply both sides by $\frac{1}{2\pi}\left(1 - \frac{|x|}{X}\right) e^{iux}$ and integrate with
respect to $x$ on $(-X, +X)$. The relation becomes

$$\frac{1}{2\pi} \int_{-X}^{+X} \left(1 - \frac{|x|}{X}\right) p_T(x)e^{iux}\,dx
= \frac{1}{\pi} \int \frac{\sin^2 \tfrac12 X(t - u)}{\tfrac12 X(t - u)^2}\, g_T(t)\,dt.$$

The left-hand side is a ch.f. (since its integrand is a product of $e^{iux}$ by
a nonnegative function) and the right-hand side converges to $g_T(u)$ as
$X \to \infty$. Therefore, $g_T$ is the limit of a sequence of ch.f.'s. Since it is
continuous at the origin, the continuity theorem applies and $g_T$ is a
ch.f. Since $g_T \to g$ as $T \to \infty$, the same theorem applies, and the
assertion is proved.


*Extension 1. The question arises whether in A continuity at the
origin is necessary. Let $g$ on $R$ be nonnegative-definite and Lebesgue-measurable.

By integrating

$$\sum_{u_j, u_k \in S_n} g(u_j - u_k)e^{i(u_j - u_k)x} \ge 0, \quad x \in R,$$

with respect to every $u \in S_n$ over $(0, T)$, we obtain

$$nT^n + n(n - 1)T^{n-2} \int_0^T\!\!\int_0^T g(u - v)e^{i(u-v)x}\,du\,dv \ge 0.$$

Dividing by $n(n - 1)T^{n-2}$ and letting $n \to \infty$, it follows that

$$\int_0^T\!\!\int_0^T g(u - v)e^{i(u-v)x}\,du\,dv \ge 0.$$

Therefore, the direct proof of the “only if” assertion in A continues to
apply, but instead of the continuity theorem use 12.2A Corollary 2, and
we obtain $g = f$ ch.f. almost everywhere (in Lebesgue measure). The
“if” assertion is modified accordingly. Thus (F. Riesz)

A'. A function $g$ on $R$ is nonnegative-definite and Lebesgue-measurable
if, and only if, it coincides a.e. with a ch.f.

*Extension 2. It can be shown that Herglotz lemma remains valid
with $D_S = \{-Nc, -(N - 1)c, \ldots, 0, \ldots, (N - 1)c, Nc\}$ whatever be
the fixed integer $N$. Then, replacing $S_r$ and $S_n$ by their intersections
with $(-U, +U)$ whatever be the fixed $U$, the proof of A remains valid.
Thus (Krein)

A''. A function $g$ on $(-U, +U)$ is nonnegative-definite and continuous
if, and only if, it coincides on $(-U, +U)$ with a ch.f.

Remark 1. The proofs of A and A'' use only the fact that $g$ is continuous
at the origin, so that these theorems imply the last assertion
in a.

Remark 2. The foregoing proofs show that in the definition of a
nonnegative-definite $g$ it suffices to take $h(u) = e^{iux}$ where $x$ runs over
$R$. Also if $g$ is Lebesgue-measurable, then the definition can be taken to
be

$$\int_0^{T_n}\!\!\int_0^{T_n} g(u - v)e^{i(u-v)x}\,du\,dv \ge 0$$

for every $x \in R$ and a sequence $T_n \to \infty$.

According to the second extension, a function which coincides with a
ch.f. on $(-U, +U)$ can be extended to a ch.f. on $R$. The problem which
arises is under what conditions this extension is unique. This is part
of the problem we investigate in the following subsection.



*15.2 Regularity and extension of ch.f.'s. According to 14.1a, if
$f = 1$ on an interval $(-U, +U)$, then $f = 1$ on $R$. Also according to
13.4, 3°, if $f_n \to 1$ on $(-U, +U)$ then $f_n \to 1$ on $R$. Thus, in these
cases a ch.f. is determined by its values on an interval, and convergence
of a sequence of ch.f.'s on $R$ follows from its convergence on an interval.
We intend to investigate more general conditions under which these
properties hold. To simplify the writing, we assume that the ch.f.'s
are those of r.v.'s, that is, take the value 1 at $u = 0$.

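a. If $\hat{f}$ is the integral ch.f. corresponding to the ch.f. $f$, then

$$\left|\frac{\hat{f}(u + h) - \hat{f}(u - h)}{2h}\right|^2 \le \tfrac12\{1 + \Re f(h)\}.$$

Since $\hat{f}(u) = \int_0^u f(v)\,dv$ and $\sin^2 hx \le (hx)^2 \cos^2 \tfrac12 hx$,
it follows, upon applying the Schwarz inequality, that

$$\left|\frac{\hat{f}(u + h) - \hat{f}(u - h)}{2h}\right|^2
= \left|\int e^{iux}\,\frac{\sin hx}{hx}\,dF(x)\right|^2
\le \int \frac{1 + \cos hx}{2}\,dF(x) = \tfrac12\{1 + \Re f(h)\}.$$

We extend now the uniform convergence theorem 13.2C. Let $f_n$ be
ch.f.'s.

b. If $f_n \to g$ on $(-U, +U)$ and $g$ is continuous at $u = 0$, then the
$f_n$ are equicontinuous and the convergence is uniform.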

Proof. Because of 13.4 (1°, 2°) and Ascoli's theorem, it suffices to prove
that the $f_n$ are equicontinuous at $u = 0$. If this conclusion is not true,
then there exist an $\epsilon > 0$, a sequence $n' \to \infty$, and a sequence $u_{n'} \to 0$,
such that $\Re f_{n'}(u_{n'}) < 1 - \epsilon$ for all $n'$; given a positive $h \in (-U, +U)$,
we take $m_{n'} = [h/u_{n'}]$, so that $m_{n'}u_{n'} \to h$. Upon applying a with
$u = kh$ and summing over $k = -m + 1, -m + 3, \ldots, m - 1$, we
obtain by the elementary inequality $|a_1 + \cdots + a_m|^2 \le m|a_1|^2 + \cdots
+ m|a_m|^2$,

$$\left|\frac{\hat{f}(mh) - \hat{f}(-mh)}{2mh}\right|^2 \le \tfrac12\{1 + \Re f(h)\}.$$

It follows that

$$\left|\frac{1}{2m_{n'}u_{n'}} \int_{-m_{n'}u_{n'}}^{+m_{n'}u_{n'}} f_{n'}(v)\,dv\right|^2
\le \tfrac12\{1 + \Re f_{n'}(u_{n'})\} < 1 - \frac{\epsilon}{2}$$

and, letting $n' \to \infty$, we have

$$\left|\frac{1}{2h} \int_{-h}^{+h} g(v)\,dv\right|^2 \le 1 - \frac{\epsilon}{2}.$$

Since $1 = f_n(0) \to g(0)$ and $g$ is continuous at $u = 0$, it follows, letting
$h \to 0$, that $1 \le 1 - \frac{\epsilon}{2}$. Therefore, ab contrario, the $f_n$ are equicontinuous
at $u = 0$, and the assertion is proved.


A. Continuity theorem on an interval. If $f_n \to f_U$ on $(-U,
+U)$ and $f_U$ is continuous at $u = 0$, then $f_U$ extends to a ch.f. $f$ on $R$; if
the extension $f$ is unique, then $f_n \to f$ on $R$.

Proof. According to b, the $f_n$ are equicontinuous. Therefore, by
Ascoli's theorem, the sequence $f_n$ is compact in the sense of uniform
convergence and, since $f_n \to f_U$ on $(-U, +U)$, all its limit ch.f.'s coincide
with $f_U$ on $(-U, +U)$. It follows that, if there is only one ch.f.
$f$ which coincides with $f_U$ on $(-U, +U)$, then $f_n \to f$ on $R$.

The second part of the problem raised above is reduced to its first part:
find ch.f.'s determined by their values on an interval $(-U, +U)$. A
partial answer is given by the following theorem (Marcinkiewicz).

B. Extension theorem for ch.f.'s. If the restriction $f_U$ of a ch.f.
$f$ to an interval $(-U, +U)$ is regular or is the boundary function of a
regular function, then $f_U$ determines $f$.

This theorem follows, by the unicity of analytic continuation, from
the three propositions below of independent interest. Let $f(z) =
\int e^{izx}\,dF(x)$, where $z = u + iv$ is a point of the complex plane $R_u \times R_v$.

a. $f(z)$ is regular in a circle $|z| < R$ if, and only if, for every positive
$r < R$, $\int e^{r|x|}\,dF(x)$ is finite.

Proof. The “if” assertion is immediate and it suffices to prove the
“only if” assertion.

I«t 



If/(z) is regular for z j < /?, then, for every positive r < R, 


and, in particular, 

Since 

it follows that 

and, hence, 



This proves the assertion. 

b. If $f(z)$ is regular in the circle $|z| < R$ or in the rectangle $|\Re z| < U$,
$|\Im z| < R$, then $f(z)$ is regular in the strip $|\Im z| < R$.

Proof. The first assertion follows at once from a. As for the second
assertion, let $V$ be the largest number such that $f(z)$ is regular in the
circle $|z| < V$ and assume that $V < R$. According to a, $f(z)$ is regular
in the strip $|\Im z| < V$. But it is also regular in the rectangle $|\Re z| <
U$, $|\Im z| < R$ and, hence, in the circle whose radius equals $\min(R,
\sqrt{U^2 + V^2})$. Therefore $V$ cannot be less than $R$ and the proof is
concluded.


For every ch.f. $f$, we have $f(z) = f^+(z) + f^-(z)$ where

$$f^+(z) = \int_0^{\infty} e^{izx}\,dF(x) \quad \text{and} \quad f^-(z) = \int_{-\infty}^{0} e^{izx}\,dF(x)$$

are regular for $\Im z > 0$ and $\Im z < 0$, respectively. Therefore, if, say,
$f^+(z)$ is regular for $0 > \Im z > -R$, then $f(z)$ is regular for $0 > \Im z > -R$, so
that the ch.f. with values $f(x)$ is the boundary function of a regular function.
Thus, the following proposition completes the proof of the foregoing
extension theorem.

c. $f^+(z)$ is regular for $0 > \Im z > -R$ if, and only if, for every positive
$r < R$, $\int_0^{\infty} e^{rx}\,dF(x)$ is finite.

Proof. The “if” assertion is immediate. As for the “only if” assertion,
we observe that, since $f^+(z)$ is regular for $\Im z > 0$ and continuous for
$\Im z \ge 0$, regularity for $0 > \Im z > -R$ implies, by a well-known symmetry
property, regularity for $|\Im z| < R$ and, hence, according to a,
$\int_0^{\infty} e^{rx}\,dF(x)$ is finite for $0 < r < R$.

Particular cases. Upon applying what precedes, we have

1° If $f_n(u) \to e^{iua}$ on $(-U, +U)$, then $f_n(u) \to e^{iua}$ for every $u \in R$.

2° If $f_n(u) \to e^{-u^2/2}$ on $(-U, +U)$, then $f_n(u) \to e^{-u^2/2}$ for every
$u \in R$.

3° If $f_n \to f$ on $(-U, +U)$ and $f$ is ch.f. of a r.v. bounded either
above or below, then $f_n \to f$ on $R$.

d. Unicity lemma. Let $g(z)$ be regular for $\Im z > 0$ and continuous for
$\Im z \ge 0$.

If $g(z) = f^+(z)$ for $\Im z = 0$ then $g(z) = f^+(z)$ for $\Im z \ge 0$.

For, $h(z) = g(z) - f^+(z)$ being regular for $\Im z > 0$ and continuous for
$\Im z \ge 0$ with $h(z) = 0$ for $\Im z = 0$ extends, by analytic continuation, to an
entire function vanishing for $\Im z = 0$, hence vanishing everywhere.

*15.3 Composition and decomposition of regular ch.f.'s. Let $F$ denote
the composed $F_1 * F_2$ of d.f.'s $F_1$ and $F_2$. In the case of $f$ or $f_1$, $f_2$
regular, the composition theorem 13.4A can be completed as follows:

A. Composition theorem for regular ch.f.'s. $f(z)$ is regular in the
strip $|\Im z| < R$ if, and only if, $f_1(z)$ and $f_2(z)$ are regular in $|\Im z| < R$.

This theorem follows at once, by 15.2a and b, from the

Composition lemma. If $F = F_1 * F_2$ then, for every $v$,

$$\int e^{vx}\,dF(x) = \int e^{vx}\,dF_1(x) \int e^{vx}\,dF_2(x),$$

and there exist finite numbers $a_j > 0$, $\beta_j \ge 0$ such that

$$\int e^{vx}\,dF(x) \ge a_j e^{-\beta_j|v|} \int e^{vx}\,dF_j(x), \quad j = 1, 2.$$

Proof. We exclude the trivial case of degenerate $F_1$ or $F_2$. The first
assertion follows, using Fatou's lemma, in a way similar to that of the
proof of the composition theorem 13.3A, whether the integrals are finite
or not.

As for the second assertion, for every $b$ either

$$\int e^{vx}\,dF_1(x) \ge \int_b^{\infty} e^{vx}\,dF_1(x) \ge e^{vb}F_1[b, +\infty)$$

or

$$\int e^{vx}\,dF_1(x) \ge \int_{-\infty}^{b} e^{vx}\,dF_1(x) \ge e^{vb}F_1(b),$$

according as $v \ge 0$ or $v < 0$. Let $\beta_2$ be the larger of two finite numbers
$|b_1|$ and $|b_2|$ such that

$$a' = F_1[b_1, +\infty) > 0 \quad \text{and} \quad a'' = F_1(b_2) > 0$$

and let $a_2$ be the smaller of $a'$ and $a''$. Then the inequalities above and
the first assertion yield

$$\int e^{vx}\,dF(x) \ge a_2 e^{-\beta_2|v|} \int e^{vx}\,dF_2(x)$$

and the proof is complete.


COMPLEMENTS AND DETAILS

Unless otherwise stated, functions $F$, with or without affixes, are d.f.'s of
r.v.'s: $F(-\infty) = 0$, $F(+\infty) = 1$, and functions $f$, with same affixes if any, are
corresponding ch.f.'s.

1. If $F$ is purely discontinuous and the discontinuity set is dense in $R$, then
the nondecreasing inverse function is singular.

2. If $F_{X_n} \to F_X$ and $\mu$ is any limit point of the sequence $\mu(X_n)$ of medians
of the $X_n$, then $\mu$ is a median of $X$. In particular, if $\mu(X)$ is the unique median
of $X$, then $\mu(X_n) \to \mu(X)$. (Take $x' < \mu < x''$ to be continuity points of $F$,
then $F(x') \le$ …

3. P. Lévy's space. Let $\mathfrak{F}$ be the space of all d.f.'s $F$ of r.v.'s. Set $d(F, F')$
to be the infimum of all those $h$ for which $F(x - h) - h \le F'(x) \le F(x + h) + h$
whatever be $x \in R$.

(a) Draw a graph and interpret $d(F, F')$ geometrically by considering lengths
of segments intercepted by the graphs of $F$ and $F'$ on parallels to the second
bisector.

(b) The function $d$ so defined is a distance, and $(\mathfrak{F}, d)$ is a complete metric
space.

(c) The following three assertions are equivalent:

$$F_n \to F, \quad d(F_n, F) \to 0, \quad \int g\,dF_n \to \int g\,dF$$

for every function $g$ continuous and bounded on $R$.

(d) A set $S$ in $\mathfrak{F}$ is compact if, and only if, $F(x) \to 0$ as $x \to -\infty$ and
$F(x) \to 1$ as $x \to +\infty$, uniformly on $S$.

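4. Establish the following correspondences for laws.

Binomial: $p_k = C_n^k p^k q^{n-k}$, $k \le n$, $f(u) = (pe^{iu} + q)^n$.

Poissonian: $p_k = \dfrac{\lambda^k}{k!} e^{-\lambda}$, $k = 0, 1, \ldots$, $f(u) = e^{\lambda(e^{iu} - 1)}$.

Uniform: $F'(x) = \dfrac{1}{b - a}$ in $(a, b)$, and 0 outside, $f(u) = \dfrac{e^{ibu} - e^{iau}}{i(b - a)u}$.

Cauchy: $F'(x) = \dfrac{1}{\pi}\,\dfrac{a}{a^2 + (x - b)^2}$, $a > 0$, $f(u) = e^{ibu - a|u|}$.

Laplace: $F'(x) = \dfrac{1}{2a} e^{-|x - b|/a}$, $a > 0$, $f(u) = (1 + a^2u^2)^{-1} e^{ibu}$.

Normal: $F'(x) = \dfrac{1}{\sigma\sqrt{2\pi}} e^{-(x - m)^2/2\sigma^2}$, $\sigma > 0$, $f(u) = e^{imu - \sigma^2u^2/2}$.

Squared Normal ($m = 0$, $\sigma = 1$): $F'(x) = \dfrac{1}{\sqrt{2\pi x}} e^{-x/2}$ for $x > 0$ and $= 0$ for
$x \le 0$, $f(u) = (1 - 2iu)^{-1/2}$.

$\Gamma$-type: $F'(x) = \dfrac{c^{\gamma}}{\Gamma(\gamma)} x^{\gamma - 1} e^{-cx}$ for $x > 0$, $c > 0$, $\gamma > 0$, 0 for $x \le 0$,
$f(u) = \left(1 - \dfrac{iu}{c}\right)^{-\gamma}$.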
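
The two discrete correspondences in item 4 can be verified by summing $p_k e^{iuk}$ directly. This is only a sketch: the parameter values, the truncation of the Poisson law at $k = 80$, and the helper name `chf_from_pmf` are assumptions made for the illustration.

```python
import cmath
import math


def chf_from_pmf(pmf, u):
    """f(u) = sum over k of p_k e^{iuk} for a law carried by the integers."""
    return sum(p * cmath.exp(1j * u * k) for k, p in pmf.items())


n, p = 10, 0.3
q = 1 - p
binomial = {k: math.comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)}

lam = 2.5
# Truncated Poisson law; the neglected tail beyond k = 80 is far below 1e-15.
poisson = {k: math.exp(-lam) * lam**k / math.factorial(k) for k in range(81)}


def binomial_gap(u):
    """Distance between the summed ch.f. and (p e^{iu} + q)^n."""
    return abs(chf_from_pmf(binomial, u) - (p * cmath.exp(1j * u) + q) ** n)


def poisson_gap(u):
    """Distance between the summed ch.f. and exp(lam (e^{iu} - 1))."""
    return abs(chf_from_pmf(poisson, u) - cmath.exp(lam * (cmath.exp(1j * u) - 1)))


us = [-3.0, -1.0, 0.0, 0.4, 2.2]
worst = max(max(binomial_gap(u), poisson_gap(u)) for u in us)
print(worst)
```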

5. The composed $F_h$ of $F$ with the uniform distribution on $(-h, +h)$ is given
by

$$F_h(x) = \frac{1}{2h} \int_{x-h}^{x+h} F(y)\,dy.$$
An absolutely convergent inversion integral follows: 

h dy ~Th X-2 / w dysa l f-X^ir) e _ iux O 

Deduce the continuity theorem.

6. Let MhJ


) /iu) \ ， ^ dUth> 0 t and let 


Mj = lim — f \f{v) I 2 血 

ti -♦ oo 2« J _ u 

(a) Mhf is nondecreasing in h and converges to 1 or Mj according as A oo 
or A — 0. lim lim Mh( IT A) is either 0 or 1 (identically in h). 

(b) Mj = $2 Pk 2 where the pk are jumps of F; M/ifz ^ MhJ = 

lj (1 - 会 ) dF 9 (x) where F 6 is d.f. with ch.f. f a = |/| 2 . (The sum is the 

jump at 0 of £(X) * £(—X) where X is sl r.v. with d.f. F.) 

(c) If f n — / with Mj = 0, then MJ n 0; the converse is not necessarily 

true. If II A — > /, then M(Jl/ k ) —> Mj. 

k^t\ k «>1 • 

7. A law is a “lattice” law if the only possible values are of form $a + ns$ only,
$s > 0$; $n = 0, \pm 1, \ldots$; if $s$ is the largest possible, then $s$ is the “step” of the
law. The step is well determined.

(a) A law is a lattice law if, and only if, $|f(u_0)| = 1$ for an $u_0 \neq 0$. The step
$s$ is given by the property that $|f(u)| < 1$ in $0 < |u| < 2\pi/s$ and $|f(2\pi/s)| = 1$.

(b) Let $p_n = P[X = a + ns]$ where $X$ has a lattice law with step $s$. Then

$$p_n = \frac{s}{2\pi} \int_{-\pi/s}^{+\pi/s} e^{-iu(a + ns)} f(u)\,du,$$

$$F(x_2) - F(x_1) = \frac{s}{2\pi} \int_{-\pi/s}^{+\pi/s} \frac{e^{-iux_1} - e^{-iux_2}}{2i \sin \tfrac12 su}\, f(u)\,du,$$

where $x_1 = a + ms - \tfrac12 s$, $x_2 = a + ns + \tfrac12 s$, $n \ge m$.
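
The inversion formula for $p_n$ in item 7(b) can be tried on a binomial law, which is a lattice law with $a = 0$ and step $s = 1$. The sketch below is illustrative only: the parameters, the number of quadrature nodes, and the function name `lattice_probability` are assumptions. The midpoint rule is used because, for a trigonometric polynomial of degree below the number of nodes integrated over a full period, it introduces no quadrature error.

```python
import cmath
import math

n_trials, p = 8, 0.35
q = 1 - p


def f(u):
    """Binomial ch.f.; the law is a lattice law with a = 0 and step s = 1."""
    return (p * cmath.exp(1j * u) + q) ** n_trials


def lattice_probability(f, a, s, n, nodes=64):
    """Midpoint-rule value of (s/2pi) * integral over (-pi/s, pi/s) of e^{-iu(a+ns)} f(u) du."""
    du = (2 * math.pi / s) / nodes
    total = 0j
    for m in range(nodes):
        u = -math.pi / s + (m + 0.5) * du
        total += cmath.exp(-1j * u * (a + n * s)) * f(u)
    return (s / (2 * math.pi)) * total * du


recovered = [lattice_probability(f, 0.0, 1.0, k).real for k in range(n_trials + 1)]
exact = [math.comb(n_trials, k) * p**k * q ** (n_trials - k) for k in range(n_trials + 1)]
worst = max(abs(r - e) for r, e in zip(recovered, exact))
print(worst)
```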

8. If the moment $m_k$ exists and is finite, then

$$\log f(u) = \sum_{j=1}^{k} \frac{a_j}{j!} (iu)^j + o(u^k).$$

The $a_k$ are called semi-invariants; formally

$$\sum_{k=1}^{\infty} \frac{a_k}{k!} z^k = \log \sum_{n=0}^{\infty} \frac{m_n}{n!} z^n.$$

Deduce the expression of a few first semi-invariants in terms of moments, and 
conversely. Prove that 

I 办丨刍 

(log 52 yT 2* is majorized by (〆’** 一 1)') 

fc-l k»l k 

9. If the derivative $F'$ on $R$ exists and is finite, then $f(u) \to 0$ as $|u| \to \infty$.
(Use Riemann-Lebesgue lemma.)

If the $n$th derivative $F^{(n)}$ on $R$ exists, is finite, and is absolutely integrable,
then $f(u) = o(|u|^{1-n})$ for $|u| \to \infty$. (Integrate by parts.)

10. Let $X$ be a r.v. with d.f. $F$.

(a) If $P[|X| \ge x] \to 0$ as $x \to \infty$ faster than any power of $x^{-1}$, then all
moments exist and are finite. (Integrate by parts $\int |x|^n\,dF(x)$.)

A pr. law is determined by the sequence of moments assumed finite if the
series $\sum \dfrac{m_n}{n!} u^n$ has a nonnull radius of convergence $\rho$. (Use Schwarz's inequality
to show that the series with the $m_n$ replaced by $\mu_n$ majorizes the expansion of
$f$ about any value of $u$, and then use analytic continuation.)

(b) Formally, by integration by parts,

$$f(z) = 1 - iz \int_{-\infty}^{0} e^{izx}F(x)\,dx + iz \int_{0}^{\infty} e^{izx}(1 - F(x))\,dx.$$

If $P[|X| \ge x] \to 0$ as $x \to \infty$ faster than $e^{-rx}$ for every positive $r < \rho$, then
$f(z)$ is analytic in the strip $|\Im z| < \rho$. If $\rho = \infty$, then $f(z)$ is an entire function.

(c) \(e w F\x) ^ r > 0 on for an r < 2 > then the pr. law is not determined 
by its moments. 

11* If/' exists and is finite on R y J* | x | dF{x) may be infinite: take 


^ cos nu 
C n -2 n 2 log n ' 


(The differentiated series converges uniformly but 


52 1/w log n = oo.) Let 


x dF{x) be the “symmetric” first 


moment. If m f exists and is finite,/^) may not exist: take a Weierstrass non- 
differentiable function a n cos b^u. 

If the derivative at « = 0 of (R/* exists, then 


r +1 /办 

o(\) + / x dF{x) y 0<h 

J — i/h 


(Set G(x) = F(x) — G(x) ， H{x) = F(x) + F (— x)，so that | AH | ^ AG. Show 

, r sin 2 (hx/2) ，广 ， 、 y ^ r 00 sin (hx) , ry/ . …、 

that J --- dG{x) — 0 as A — 0, j -- dH{x) = o(l).) 

Under the foregoing condition, J f exists and is finite if, and only if, 

x dF{x) exists and is finite, and then /’(0) = im\ Extend to 

—a 。 

any derivative of odd order. What about those of even order? 

12. If $g$ on $R$ is not constant and $g(u) = 1 + o(u) + o(u^2)$ near $u = 0$ with
$o(u)$ an odd function, then $g$ is not a ch.f. (Observe that $g(u)g(-u) = 1 + o(u^2)$.)
Examples: $e^{-|u|^r}$ for $r > 2$, $1/(1 + u^4)$.

13. Let $g$ on $R$ be real, even and continuous, with $g(0) = 1$, $g(u) \to 0$ as
$u \to \infty$.

If $g$ is convex from below on $[0, +\infty)$, then $g$ is a ch.f. (To prove
$\int_0^{\infty} g(u) \cos xu\,du \ge 0$ for $x > 0$, observe that on $[0, \infty)$, say, the left-hand side
derivative $g'$ exists and is nondecreasing, with $g'(u) \le 0$ and $g'(u) \to 0$ as
$u \to \infty$. Set $h = -g'$ so that, by integration by parts,

$$x \int_0^{\infty} g(u) \cos xu\,du = \int_0^{\infty} h(u) \sin xu\,du.$$

For $x > 0$, the last integral is

$$\int_0^{\pi/x} \left\{h(u) - h\left(u + \frac{\pi}{x}\right) + h\left(u + \frac{2\pi}{x}\right) - h\left(u + \frac{3\pi}{x}\right) + \cdots\right\} \sin xu\,du \ge 0.)$$

Examples: $1/(1 + |u|)$, $1 - |u|$ for $|u| \le 1$ and 0 for $|u| > 1$.

14. (a) Two ch.f.'s may coincide on intervals without being identical.

Take $F'_1(x) = \dfrac{1}{\pi}\,\dfrac{1 - \cos x}{x^2}$, hence $f_1(u) = 1 - |u|$ for $|u| \le 1$ and 0 for
$|u| > 1$, and take $F_2$ defined by $p_0 = \tfrac12$, $p_{\pm(2k-1)\pi} = \dfrac{2}{(2k - 1)^2\pi^2}$; $f_2(u)$ is periodic
of period two and coincides with $f_1$ on $[-1, +1]$. Or, take $f$ to be a ch.f. of
the type described in 13 with $f'$ continuous and strictly increasing on $[0, \infty)$. Replace
two arbitrarily small arcs of the graph of $f$ which are symmetric with
respect to the $y$-axis by their chords, and compare the function so defined with $f$.

(b) The compositions of a law with either one of two distinct laws may coincide
($f_1f_1 = f_1f_2$).

(c) If $f_n \to f$ on $[-U, +U]$, the same may not be true on $R$.
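
The first example in item 14(a) can be explored numerically. The sketch below assumes the atoms $p_0 = 1/2$ and $p_{\pm(2k-1)\pi} = 2/((2k-1)^2\pi^2)$ and truncates the series after a hypothetical number of terms, so the agreement on $[-1, +1]$ is only approximate (to within the truncation error).

```python
import math


def f1(u):
    """Triangular ch.f.: 1 - |u| on [-1, 1] and 0 elsewhere."""
    return max(0.0, 1.0 - abs(u))


def f2(u, terms=2000):
    """Ch.f. of the discrete law with atoms at 0 and at odd multiples of pi, truncated."""
    total = 0.5  # contribution of the atom p_0 = 1/2 at the origin
    for k in range(1, terms + 1):
        odd = 2 * k - 1
        # The two atoms at +odd*pi and -odd*pi contribute 2 * p * cos(odd * pi * u).
        total += 4.0 / (odd * odd * math.pi**2) * math.cos(odd * math.pi * u)
    return total


# Largest discrepancy at a few points of [-1, 1], and the discrepancy at u = 2.
inside = max(abs(f1(u) - f2(u)) for u in (-1.0, -0.6, 0.0, 0.25, 0.9))
outside = abs(f1(2.0) - f2(2.0))
print(inside, outside)
```

Inside the interval the two functions agree up to the truncation error, while at $u = 2$ the periodic one is close to 1 and the triangular one vanishes.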

15. $f$ on $R$ is a ch.f. if, and only if, there exists a sequence $g_n$ such that
$\int |g_n(v)|^2\,dv \to 1$ and $\int g_n(u + v)\bar{g}_n(v)\,dv \to f(u)$ uniformly in every finite
interval.

(For the “if” assertion, observe that every integral is positive-definite. For 
the “only if” assertion, divide [ — +n] into n 2 equal subintervals, set F n (—”) 
= 0, F n (n) = 1, F n =* F at the subdivision points, and linear inside every sub¬ 
interval; set Cngniu) = I y/e iux dx with ^ n (0) =» 1. Compute f n and 
observe that/n — /•) 

16. Let $g$ and $h$ be bounded and continuous on $R$, with $g(u) = \bar{g}(-u)$, and
let $\lambda(u)$ be an arbitrary finite function on $R$.

If for every finite set $A$ of values of $u$

$$\Big|\sum_{u \in A} \sum_{v \in A} g(u - v)\lambda(u)\bar{\lambda}(v)\Big| \le \sum_{u \in A} \sum_{v \in A} h(u - v)\lambda(u)\bar{\lambda}(v),$$

then

$$h(u) = \int e^{iux}\,dH(x)$$

where $H$ is a d.f. up to a multiplicative constant.

The foregoing inequalities represent a necessary and sufficient condition for
$g$ to be of the form

$$g(u) = \int e^{iux}\,dG(x)$$

with $|\Delta G| \le \Delta H$. Find the relation between discontinuity and continuity
points of $G$ and $H$.

17. The uniqueness and composition properties determine “essentially” the
form of ch.f.'s. Let $K$ on $R \times R$ be bounded and continuous. If the functions
$g$ on $R$ are defined by $g(u) = \int K(u, x)\,dF(x)$ for every d.f. $F$, and the uniqueness
and composition properties hold, then $K(u, x) = e^{ixh(u)}$ and $f(h(u)) = g(u)$.
18. Normal vectors. A normal vector $X = (X_k, k \le n)$ is so defined that all
r.v.'s of the form $\sum_k u_kX_k$ are normal. Let the $X_k$ be centered at expectations.

A ch.f. $f$ on $R^n$ is that of a normal vector (centered at its expectation) if, and
only if,

$$\log f(u_1, \ldots, u_n) = Q(u_1, \ldots, u_n) = -\tfrac12 \sum_{j,k} m_{jk}u_ju_k \le 0$$

where $m_{jk} = EX_jX_k$.

If the inequality is strict, then the normal d.f. is defined by 


d Xl ••• d Xn 


F(x u 


=(2x) n / 2 D H<f 




where D = || m jk || > 0 and^(^i, •••，〜）= 圣 H Dj k XjXk is the reciprocal form 
of Q{u \ y ， • n) with the variables Xk- What if 0 2 0? 


19. If $(X, Y)$ is a normal pair centered at expectations, then $EXY/\sigma X\sigma Y =
\cos p\pi$ where $p = P[XY < 0]$. (Compute $P[XY < 0]$ using the d.f.)




Part Three 


INDEPENDENCE 


Until very recently, probability theory could have been defined to
be the investigation of the concept of independence. This concept continues
to provide new problems. Also it has originated and continues
to originate most of the problems where independence is not assumed.

The main model is that of sequences of sums of independent random
variables. The main problems are the Strong Central Limit Problem
and the (Laws) Central Limit Problem. The first is concerned with almost
sure convergence and stability properties. The second one is
concerned with convergence of laws. All general results were obtained
since 1900.

SUMS OF INDEPENDENT RANDOM VARIABLES


Two properties play a basic role in the study of independent r.v.'s:
the Borel zero-one law and the multiplication theorem for expectations.
Two general a.s. limit problems for sums of independent r.v.'s have been
investigated: the a.s. convergence problem and the a.s. stability problem.
Both of them took their present form in the second quarter of
this century.

§ 16. CONCEPT OF INDEPENDENCE

Convention. To avoid endless repetitions, we make the convention
that, unless otherwise stated,

- r.v.'s, random vectors and, in general, random functions are defined
on a fixed but otherwise arbitrary pr. space $(\Omega, \mathcal{A}, P)$.

- indices $t$ vary on a fixed but otherwise arbitrary index set $T$, and
events of a class have the index of the class.

16.1 Independent classes and independent functions. Events $A_t$ are
said to be independent if, for every finite subset $(t_1, \ldots, t_n)$,

$$\text{(I)} \qquad P \bigcap_{k=1}^{n} A_{t_k} = \prod_{k=1}^{n} PA_{t_k}.$$

In fact, the concept of independence is relative to families of classes
(see Application 1° below).
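
Relation (I) has to hold for every finite subfamily, not only for pairs. The sketch below illustrates this on a hypothetical pr. space of two fair coin tosses (an assumption chosen for the example, with illustrative helper names): the three events are independent two by two, yet (I) fails for the triple.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical pr. space: two fair coin tosses, four equally likely outcomes.
omega = [(a, b) for a in (0, 1) for b in (0, 1)]


def prob(event):
    """Probability of an event (a set of outcomes) under the uniform law."""
    return Fraction(len(event), len(omega))


events = {
    "first toss is heads": {w for w in omega if w[0] == 1},
    "second toss is heads": {w for w in omega if w[1] == 1},
    "both tosses agree": {w for w in omega if w[0] == w[1]},
}


def satisfies_product_rule(names):
    """Check P(intersection) == product of probabilities for the named events."""
    inter = set(omega)
    product = Fraction(1)
    for name in names:
        inter &= events[name]
        product *= prob(events[name])
    return prob(inter) == product


def independent(names):
    """Relation (I) must hold for every subfamily of two or more events."""
    return all(
        satisfies_product_rule(sub)
        for r in range(2, len(names) + 1)
        for sub in combinations(names, r)
    )


pairwise = all(satisfies_product_rule(pair) for pair in combinations(events, 2))
mutual = independent(list(events))
print(pairwise, mutual)
```

Exact rational arithmetic is used so that the equalities in (I) are tested without rounding.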

Classes of events are said to be independent if their events are independent;
in other words, if events selected arbitrarily one from each
class are independent. Clearly, if the $\mathcal{C}_t$ are independent so are the
$\mathcal{C}'_{t'} \subset \mathcal{C}_{t'}$, $t' \in T' \subset T$. Because of its constant use, we state this fact
as a theorem.

A. Subclasses of independent classes are independent.

Let $X_t$ be r.v.'s or random vectors or, in general, random functions.
Let $\mathcal{B}(X_t)$ be the sub $\sigma$-field of events induced by $X_t$, that is, the inverse
image under $X_t$ of the Borel field in the range space of $X_t$.

The $X_t$ are said to be independent if they induce independent $\sigma$-fields
$\mathcal{B}(X_t)$. Then classes $\mathcal{B}_t \subset \mathcal{B}(X_t)$ are independent. Since a Borel
function of $X_t$ induces a sub $\sigma$-field $\mathcal{B}_t$ of events contained in $\mathcal{B}(X_t)$, it
follows that

A'. Borel functions theorem. Borel functions of independent random
functions are independent.

Independent classes can be enlarged, to some extent, without destroying
independence. More precisely

Let $\mathcal{C}_t$ be independent classes. Independence is preserved if to every
$\mathcal{C}_t$ we adjoin

1° the null and the a.s. events, for (I) is trivially true (both sides
reducing to 0) when at least one of the events which figure in it is
null, while (I) with $n$ indices reduces to (I) with fewer indices when at
least one of the events which figure in it is a.s.;

2° the proper differences of its elements and, in particular, their complements
(because of 1°), for if $A_{t_1} \supset A'_{t_1}$ then

$$P(A_{t_1} - A'_{t_1})A_{t_2} \cdots A_{t_n} = PA_{t_1}A_{t_2} \cdots A_{t_n} - PA'_{t_1}A_{t_2} \cdots A_{t_n}$$
$$= (PA_{t_1} - PA'_{t_1})PA_{t_2} \cdots PA_{t_n}$$
$$= P(A_{t_1} - A'_{t_1})PA_{t_2} \cdots PA_{t_n};$$

3° the countable sums of its elements, for

$$P\Big(\sum_j A^j_{t_1}\Big)A_{t_2} \cdots A_{t_n} = \sum_j PA^j_{t_1}A_{t_2} \cdots A_{t_n}$$
$$= \Big(\sum_j PA^j_{t_1}\Big)PA_{t_2} \cdots PA_{t_n}$$
$$= P\Big(\sum_j A^j_{t_1}\Big)PA_{t_2} \cdots PA_{t_n};$$

4° the limits of sequences of its elements, for if $A^m_{t_1} \to A_{t_1}$ as $m \to \infty$,
then

$$PA_{t_1}A_{t_2} \cdots A_{t_n} \leftarrow PA^m_{t_1}A_{t_2} \cdots A_{t_n}
= PA^m_{t_1}PA_{t_2} \cdots PA_{t_n} \to PA_{t_1}PA_{t_2} \cdots PA_{t_n}.$$

It follows easily that

B. Extension theorem. Minimal $\sigma$-fields over independent classes
closed under finite intersections are independent.

Applications. 1° If the events $A_t$ are independent, so are the $\sigma$-fields
$(A_t, A_t^c, \emptyset, \Omega)$.

2° If the inverse images $\mathcal{C}_t$ of the classes of all intervals $(-\infty, x_t)$
in Borel spaces $R_t$ are independent, so are the inverse images $\mathcal{B}_t$ of the
Borel fields in the $R_t$. For, every $\mathcal{C}_t$ is closed under finite intersections
and $\mathcal{B}_t$ is the minimal $\sigma$-field over $\mathcal{C}_t$.

*3° Let $\mathcal{B}_t$ be σ-fields (or fields) of events and let $T_s$ be a subset of
the index set $T$. The compound σ-field $\mathcal{B}_{T_s}$ with components $\mathcal{B}_t$, $t \in T_s$,
is the minimal σ-field over the class $\mathcal{C}_{T_s}$ of all finite intersections of
events $A_t \in \mathcal{B}_t$, $t \in T_s$, and contains all its components; since the $\mathcal{B}_t$ are
closed under finite intersections so is $\mathcal{C}_{T_s}$. If $T'_s \subset T_s$, then $\mathcal{B}_{T'_s}$ is a
compound sub σ-field of $\mathcal{B}_{T_s}$ and, if $T'_s$ is finite, then $\mathcal{B}_{T'_s}$ is a
"finitely compound" sub σ-field.

If compound σ-fields are independent, then, by A, their finitely
compound sub σ-fields are independent. Conversely, if the finitely
compound sub σ-fields are independent, then, by the extension theorem, the
compound σ-fields are independent. We state these facts as a theorem.

C. Compounds theorem. Compound σ-fields are independent if, and
only if, their finitely compound sub σ-fields are independent.

In particular, if the $\mathcal{B}_t$ are independent, so are the $\mathcal{B}_{T_s}$ for every
partition of $T$ into sets $T_s$.

Families $X_{T_s} = \{X_t, t \in T_s\}$ of r.v.'s induce sub σ-fields $\mathcal{B}(X_{T_s})$ of
events. Every $\mathcal{B}(X_{T_s})$ is the minimal σ-field over the class $\mathcal{C}(X_{T_s})$ of
inverse images of all intervals in the range space $R_{T_s}$ of $X_{T_s}$. But the
intervals in the Borel space $R_{T_s}$ are products of intervals in the factor
spaces $R_t$, with only a finite number of factor intervals different from
the whole factor spaces, and the inverse image of any factor space is $\Omega$.
Therefore the elements of $\mathcal{C}(X_{T_s})$ are all the finite intersections of
elements of the $\mathcal{B}(X_t)$. It follows that the σ-field $\mathcal{B}(X_{T_s})$ is a compound
of the σ-fields $\mathcal{B}(X_t)$, and theorem C becomes

C'. Families theorem. Families of random variables are independent
if, and only if, their finite subfamilies are mutually independent.

Thus, in the last analysis, independence of random functions reduces 
to independence of random vectors. 

To conclude this investigation of the definition of independence, let
us observe that all which precedes applies to complex r.v.'s, to
complex random vectors, and, in general, to complex random functions
$X_t = X'_t + iX''_t$ considered as vector random functions $(X'_t, X''_t)$,
$t \in T$.
16.2 Multiplication properties. The direct definition of independent 
r.v.'s is as follows: 

Random variables $X_t$, $t \in T$, are independent if, for every finite class
$(S_{t_1}, \ldots, S_{t_n})$ of Borel sets in $R$,

$$P\bigcap_{k=1}^n [X_{t_k} \in S_{t_k}] = \prod_{k=1}^n P[X_{t_k} \in S_{t_k}].$$

The basic expectation property of independent r.v.'s is expressed by

a. Multiplication lemma. If $X_1, \ldots, X_n$ are independent nonnegative
r.v.'s, then $E\prod_{k=1}^n X_k = \prod_{k=1}^n EX_k$.

Proof. It suffices to prove the assertion for two independent r.v.'s
$X$ and $Y$, for then the general case follows by induction. First, let
$X = \sum_j x_jI_{A_j}$ and $Y = \sum_k y_kI_{B_k}$ be nonnegative simple (or elementary)
r.v.'s; we can always take the $x_j$ and, similarly, the $y_k$ to be all
distinct, so that $A_j = [X = x_j]$, $B_k = [Y = y_k]$. Since $X$ and $Y$ are
independent, $PA_jB_k = PA_jPB_k$ and, hence,

$$EXY = \sum_{j,k} x_jy_k\,PA_jPB_k = \sum_j x_jPA_j \cdot \sum_k y_kPB_k = EX\,EY.$$

Now, let $X$ and $Y$ be nonnegative r.v.'s and set

$$A_{nj} = \Big[\frac{j-1}{2^n} \le X < \frac{j}{2^n}\Big], \quad B_{nk} = \Big[\frac{k-1}{2^n} \le Y < \frac{k}{2^n}\Big].$$

Since $X$ and $Y$ are independent so are these events and, hence, so are
the simple r.v.'s

$$X_n = \sum_{j=1}^{n2^n} \frac{j-1}{2^n}I_{A_{nj}}, \quad Y_n = \sum_{k=1}^{n2^n} \frac{k-1}{2^n}I_{B_{nk}}.$$

But $0 \le X_n \uparrow X$, $0 \le Y_n \uparrow Y$, so that $0 \le X_nY_n \uparrow XY$ and, by what
precedes, $EX_nY_n = EX_nEY_n$. Therefore, by the monotone convergence
theorem, $EXY = EXEY$, and the lemma is proved.
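
The first step of the proof (simple r.v.'s) can be checked by direct computation. The following is only a minimal sketch: the two finite laws `LAW_X` and `LAW_Y` are hypothetical, chosen for illustration, and independence is modelled by taking the product of the two laws as joint law. The last lines show, for contrast, that the equality fails for the dependent pair $(X, X)$.

```python
from fractions import Fraction
from itertools import product

def expectation(law):
    """Expectation of a finite law given as {value: probability}."""
    return sum(x * p for x, p in law.items())

def expectation_of_product(law_x, law_y):
    """E[XY] for independent X, Y: the joint law is the product of the two laws."""
    return sum(x * y * px * py
               for (x, px), (y, py) in product(law_x.items(), law_y.items()))

# Hypothetical laws, chosen only for illustration.
LAW_X = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
LAW_Y = {2: Fraction(1, 3), 5: Fraction(2, 3)}

exy = expectation_of_product(LAW_X, LAW_Y)
ex_ey = expectation(LAW_X) * expectation(LAW_Y)

# Contrast: Y = X is fully dependent on X, and E[X * X] differs from (EX)^2.
exx = sum(x * x * p for x, p in LAW_X.items())

print(exy, ex_ey, exx, expectation(LAW_X) ** 2)
```

Exact rational arithmetic is used so that the equality $EXY = EXEY$ is tested without rounding error.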

A. Multiplication theorem. Let $X_1, \ldots, X_n$ be independent r.v.'s.
If these r.v.'s are integrable so is their product, and
$E\prod_{k=1}^n X_k = \prod_{k=1}^n EX_k$.
Conversely, if their product is integrable and none is degenerate at 0, then
they are integrable.
Proof. It suffices to prove the assertion for two independent r.v.'s
$X$ and $Y$. We observe that independence of $X$ and $Y$ implies that the
nonnegative r.v.'s $X' = X^+$ or $X^-$ or $|X|$ and $Y' = Y^+$ or $Y^-$ or
$|Y|$ are independent, so that, by a, $EX'Y' = EX'EY'$. Now, if $X$
and $Y$ are integrable so are $X'$ and $Y'$ and, by the foregoing equality,
so is $X'Y'$. Therefore $|XY|$ and hence $XY$ are integrable and, by the
same equality,

$$EXY = E(X^+ - X^-)(Y^+ - Y^-) = EX^+EY^+ - EX^+EY^- - EX^-EY^+ + EX^-EY^- = EXEY.$$

Conversely, if $XY$ is integrable so that $E|X|E|Y| = E|XY| < \infty$,
and neither $X$ nor $Y$ degenerates at 0 so that $E|X|$ and $E|Y|$ do not
vanish, then $E|X|$ and $E|Y|$ are finite, and the proof is concluded.

Extension. The multiplication theorem remains valid for independent
complex r.v.'s $X_k = X'_k + iX''_k$, since it applies to every term of
the expansion of $\prod_{k=1}^n (X'_k + iX''_k)$. In particular, according to the
Borel functions theorem, if the $X_k$ are independent so are the $e^{iuX_k}$
and, hence,

$$Ee^{iu\sum_{k=1}^n X_k} = E\prod_{k=1}^n e^{iuX_k} = \prod_{k=1}^n Ee^{iuX_k}.$$

In other words, 

Corollary. Ch.f.'s of sums of independent r.v.'s are products of ch.f.'s
of the summands. 
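
As a numerical illustration of the corollary (a sketch only; the laws `DIE` and `COIN` and the point `u` are hypothetical choices), the ch.f. of the law of $X + Y$, obtained by convolution, can be compared with the product of the two ch.f.'s.

```python
import cmath
from itertools import product

def ch_f(law, u):
    """Characteristic function E exp(iuX) of a finite law {value: probability}."""
    return sum(p * cmath.exp(1j * u * x) for x, p in law.items())

def law_of_sum(law_x, law_y):
    """Law of X + Y for independent X and Y (convolution of the two laws)."""
    out = {}
    for (x, px), (y, py) in product(law_x.items(), law_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

# Hypothetical laws: a fair die and a fair coin.
DIE = {k: 1 / 6 for k in range(1, 7)}
COIN = {0: 0.5, 1: 0.5}

u = 0.7
gap = abs(ch_f(law_of_sum(DIE, COIN), u) - ch_f(DIE, u) * ch_f(COIN, u))
print(gap)  # zero up to floating-point rounding
```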

This proposition, to be used extensively in the following chapter, is but 
a special case of a property which can serve as an equivalent definition 
of independent r.v.'s, as follows: 

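Let $F_t$ and $f_t$, $F_{t_1 \cdots t_n}$ and $f_{t_1 \cdots t_n}$ be the d.f.'s and ch.f.'s of the r.v.
$X_t$ and of the random vector $(X_{t_1}, \ldots, X_{t_n})$, respectively.

B. Equivalence theorem. The three following definitions of
independence of the r.v.'s $X_t$ are equivalent.

For every finite class of Borel sets $S_t$ and of points $x_t$, $u_t \in R$

$(I_1)$  $P\bigcap_{k=1}^n [X_{t_k} \in S_{t_k}] = \prod_{k=1}^n P[X_{t_k} \in S_{t_k}]$,

$(I_2)$  $F_{t_1 \cdots t_n}(x_{t_1}, \ldots, x_{t_n}) = F_{t_1}(x_{t_1}) \cdots F_{t_n}(x_{t_n})$,

$(I_3)$  $f_{t_1 \cdots t_n}(u_{t_1}, \ldots, u_{t_n}) = f_{t_1}(u_{t_1}) \cdots f_{t_n}(u_{t_n})$.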
Proof. $(I_1)$ implies $(I_2)$ by taking $S_t = (-\infty, x_t)$. Conversely, $(I_2)$
implies $(I_1)$ with $S_t = (-\infty, x_t)$ and, on account of 16.1, Application 2°,
this implies $(I_1)$ for all $S_t$.

$(I_2)$ implies $(I_3)$, for $(I_2)$ implies $(I_1)$ which implies $(I_3)$ exactly as
the multiplication theorem implies its corollary. Conversely, $(I_3)$
implies $(I_2)$, for the inversion formula for one- and multi-dimensional
ch.f.'s shows at once that if $(I_3)$ is true, then, for all continuity intervals,

$$F_{t_1 \cdots t_n}[a_{t_1}, b_{t_1}; \ldots; a_{t_n}, b_{t_n}) = F_{t_1}[a_{t_1}, b_{t_1}) \cdots F_{t_n}[a_{t_n}, b_{t_n}),$$

and $(I_2)$ follows by letting the $a_t \to -\infty$ and $b_t \uparrow x_t$. This completes
the proof.

Extension. The equivalence theorem is valid when the $X_t$ are random
vectors, for the proof applies word by word, provided $R$ is replaced
by the range space $R_t$ of $X_t$.

16.3 Sequences of independent r.v.'s. At the root of known a.s.
limit properties of sequences of independent r.v.'s lies the celebrated

A. Borel zero-one criterion. If the events $A_n$ are independent,
then $P(\limsup A_n) = 0$ or 1 according as $\sum PA_n < \infty$ or $= \infty$.

Proof. Since

$$P(\limsup A_n) = \lim_m \lim_n P\bigcup_{k=m}^n A_k = \lim_m \lim_n \Big(1 - P\bigcap_{k=m}^n A_k^c\Big)$$

and the events $A_n$ and hence $A_n^c$ are independent, the assertion
follows by passing to the limit in the elementary inequality

$$1 - \exp\Big[-\sum_{k=m}^n PA_k\Big] \le 1 - \prod_{k=m}^n (1 - PA_k) \le \sum_{k=m}^n PA_k.$$

Since, whatever be the events $A_n$, $\sum PA_n < \infty$ implies that

$$\lim_m \lim_n P\bigcup_{k=m}^n A_k \le \lim_m \lim_n \sum_{k=m}^n PA_k = 0,$$

the "zero" part of this criterion is valid with no assumption of
independence:

a. Borel-Cantelli lemma. If $\sum PA_n < \infty$, then $P(\limsup A_n) = 0$.

Corollary 1. If the events $A_n$ are independent and $A_n \to A$, then
$PA = 0$ or 1.

Corollary 2. If the r.v.'s $X_n$ are independent and $X_n \xrightarrow{a.s.} 0$, then
$\sum P[|X_n| \ge c] < \infty$ whatever be the finite number $c > 0$.

For $X_n \xrightarrow{a.s.} 0$ implies that, if $A_n = [|X_n| \ge c]$, then $P(\limsup A_n) = 0$,
and independence of the $X_n$ implies that of the $A_n$.
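
The elementary inequality used in the proof of the zero-one criterion can be evaluated numerically. The sketch below is an illustration under assumed choices (the function name `borel_bounds` and the two probability sequences are hypothetical): for independent events the middle term is $P\bigcup_{k=m}^n A_k$, which stays small when $PA_k = 1/k^2$ and approaches 1 when $PA_k = 1/k$.

```python
import math

def borel_bounds(probs):
    """Return the three terms of
    1 - exp(-sum p_k) <= 1 - prod(1 - p_k) <= sum p_k
    for a finite list of probabilities p_k = P(A_k) of independent events."""
    lower = 1.0 - math.exp(-sum(probs))
    middle = 1.0 - math.prod(1.0 - p for p in probs)  # P(union of the A_k)
    upper = sum(probs)
    return lower, middle, upper

M, N = 10, 10_000
convergent = borel_bounds([1.0 / k ** 2 for k in range(M, N + 1)])  # sum P(A_k) < infinity
divergent = borel_bounds([1.0 / k for k in range(M, N + 1)])        # sum P(A_k) = infinity

print(convergent)
print(divergent)
```

Letting first $n$ and then $m$ increase drives the upper bound to 0 in the convergent case and the lower bound to 1 in the divergent case, which is the content of the criterion.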

Because of its intuitive appeal, instead of "$\limsup A_n$" we shall
sometimes write "$A_n$ i.o.", to be read "$A_n$'s occur infinitely often" or
"infinitely many $A_n$ occur." This terminology corresponds to the fact
that $\limsup A_n$ is the set of all those elementary events which belong
to infinitely many $A_n$ or, equivalently, to some of "the $A_n, A_{n+1}, \ldots$
however large be $n$" (the "tail" of the sequence $A_n$). To the "tail" of
the sequence $A_n$ of events corresponds the "tail" of the sequence $I_{A_n}$
of their indicators. More generally, the "tail" of a sequence $X_n$ of
r.v.'s is "the sequence $X_n, X_{n+1}, \ldots$ however large be $n$."

To be precise, let $X_1, X_2, \ldots$ be a sequence of r.v.'s and let $\mathcal{B}(X_n)$,
$\mathcal{B}(X_n, X_{n+1})$, ..., $\mathcal{B}(X_n, X_{n+1}, \ldots)$, $\mathcal{B}(X_{n+1}, X_{n+2}, \ldots)$, ... be
sub σ-fields of events induced by the random functions within the
brackets. We give a precise meaning to $\limsup \mathcal{B}(X_n)$, as follows: The
sequence $\mathcal{B}(X_n)$, $\mathcal{B}(X_n, X_{n+1})$, ... is a nondecreasing sequence of
σ-fields, its supremum or union is a field, and the minimal σ-field over
this field is $\mathcal{B}(X_n, X_{n+1}, \ldots)$ or, writing loosely, "$\sup_{m \ge n} \mathcal{B}(X_m)$." In
turn, the sequence $\mathcal{B}(X_n, X_{n+1}, \ldots)$, $\mathcal{B}(X_{n+1}, X_{n+2}, \ldots)$, ... is a
nonincreasing sequence of σ-fields and its limit or intersection is a
σ-field $\mathcal{C}$ contained in $\mathcal{B}(X_n, X_{n+1}, \ldots)$ however large be $n$ or, writing
loosely, "$\limsup \mathcal{B}(X_n)$." The σ-field $\mathcal{C}$ will be called the tail σ-field
of the sequence $X_n$ or "the sub σ-field of events induced by the tail of
the sequence $X_n$." Let us observe that all the foregoing σ-fields and,
in particular, the tail σ-field, are contained in the σ-field $\mathcal{B}(X_1, X_2, \ldots)$
induced by the whole sequence $X_n$. The elements of the tail σ-field $\mathcal{C}$
are tail events and the numerical (finite or not) $\mathcal{C}$-measurable functions,
that is, those functions which induce sub σ-fields of events contained in $\mathcal{C}$,
are tail functions; they are defined on the "tail" of the sequence. For
example, the limits inferior and superior of the sequence $X_n$ and of the
sequence $(X_1 + X_2 + \cdots + X_n)/b_n$, where $b_n \to \infty$, are tail functions
(not necessarily finite), while the sets of convergence of these sequences,
as well as the set of convergence of the series $\sum X_n$, are tail events.

To Borel's result corresponds the basic Kolmogorov's

B. Zero-one law. On a sequence of independent r.v.'s, the tail events 
have for pr. either 0 or 1 and the tail functions are degenerate. 

In other words, the tail σ-field of a sequence of independent r.v.'s is
equivalent to $\{\emptyset, \Omega\}$.
Proof. We observe that an event $A$ is independent of itself if, and
only if, $PAA = PA \cdot PA$, that is, if $PA = 0$ or 1, and such events are
mutually independent. Thus, the first assertion means that the tail
σ-field $\mathcal{C}$ of the sequence $X_n$ of independent r.v.'s is independent of
itself. Since $\mathcal{C} \subset \mathcal{B}(X_{n+1}, X_{n+2}, \ldots)$ whatever be $n$ and, because of
the independence assumption, $\mathcal{B}(X_1, \ldots, X_n)$ is independent of
$\mathcal{B}(X_{n+1}, X_{n+2}, \ldots)$, it follows that $\mathcal{C}$ is independent of $\mathcal{B}(X_1, X_2, \ldots, X_n)$
whatever be $n$. Therefore, $\mathcal{C}$ is independent of $\mathcal{B}(X_1, X_2, \ldots)$ and,
being contained in $\mathcal{B}(X_1, X_2, \ldots)$, it is independent of itself. This
proves the first assertion and the second follows, since, if $X$ is a tail
function, then it is a.s. $\{\emptyset, \Omega\}$-measurable hence degenerates.

Corollary. If $X_n$ are independent r.v.'s, then the sequence $X_n$ either
converges a.s. or diverges a.s.; and similarly for the series $\sum X_n$.
Moreover, the limits of the sequences $X_n$ and $(X_1 + \cdots + X_n)/b_n$ where
$b_n \uparrow \infty$, are degenerate.

*16.4 Independent r.v.'s and product spaces. Let $X_t$, where $t$ runs
over an index set $T$, be independent r.v.'s with d.f.'s $F_{X_t}$ on $R_t$.
Because of the correspondence theorem, every $F_{X_t}$ determines a pr. $P_{X_t}$
on the Borel field $\mathcal{B}_t$ in $R_t$. On account of the product-measure
theorem, the $P_{X_t}$ determine a product-measure $\prod P_{X_t}$ on the product Borel
field $\prod \mathcal{B}_t$ in the product space $\prod R_t$. On the other hand, the law of
the family $X = \{X_t, t \in T\}$, represented by the family of d.f.'s
$F_{X_{t_1} \cdots X_{t_N}}$ of all finite subfamilies of $X$ determines, by the
correspondence theorem, a family of consistent measures $P_{X_{t_1} \cdots X_{t_N}}$ on
the product Borel fields $\prod_{k=1}^N \mathcal{B}_{t_k}$. Owing to the consistent measures
theorem, this family of pr.'s determines a pr. $P_X$ on $\prod \mathcal{B}_t$.
Since the $X_t$ are independent,

$$F_{X_{t_1} \cdots X_{t_N}} = F_{X_{t_1}} \times \cdots \times F_{X_{t_N}}$$

so that

$$P_{X_{t_1} \cdots X_{t_N}} = P_{X_{t_1}} \times \cdots \times P_{X_{t_N}}$$

and, therefore, $P_X$ coincides with $\prod P_{X_t}$. In other words,

A. The pr. space induced on its range space by a family of independent
r.v.'s is the product of pr. spaces induced on their respective range spaces
by the r.v.'s of the family.

Let us observe that this reduces the multiplication theorem to the 
Fubini theorem. 
The question arises whether the converse is true: Given a product
pr. space $(\prod R_t, \prod \mathcal{B}_t, \prod P_t)$, is there a family $\{X_t, t \in T\}$ of
independent r.v.'s on some pr. space $(\Omega, \mathcal{A}, P)$ which induces this product
pr. space? Equivalently, given a family $\{F_t, t \in T\}$ of d.f.'s with
variation 1, is there a family $\{X_t, t \in T\}$ of independent r.v.'s with
$F_{X_t} = F_t$?

If the pr. space on which the r.v.'s have to be defined is fixed, then,
in general, the answer is in the negative, since on a fixed pr. space even
one r.v. with a given d.f. might not exist. However, if we are at
liberty to select the pr. space on which to define r.v.'s, and we shall always
do so, then the answer is in the affirmative, as follows:

Let the pr. space be the product pr. space $(\prod R_t, \prod \mathcal{B}_t, \prod P_t)$ where,
if the $F_t$ are given, the $P_t$ are determined upon applying the
correspondence theorem. The r.v.'s $X_t$, defined on this pr. space by $X_t(x) = x_t$,
$x = \{x_t, t \in T\}$, are then independent, since their pr.d.'s are $P_t$ and
their d.f.'s are $F_t$. Thus

B. The relation $X_t(x) = x_t$, $x = \{x_t, t \in T\}$, establishes a one-to-one
correspondence between families $\{X_t\}$ of independent r.v.'s and product pr.
spaces on $\prod R_t$.

Remark. There exist pr. spaces on which can be defined all possible
families of independent r.v.'s with a given index set $T$. For example,
take the pr. space $(\Omega, \mathcal{A}, P)$ where $\Omega = \prod \Omega_t$ with $\Omega_t = (0, 1)$ and
$P = \prod P_t$ on the Borel field $\mathcal{A}$ in $\Omega$, with $P_t$ being the Lebesgue measure
on the Borel field in $\Omega_t$ (class of Borel sets in $\Omega_t$). Then the r.v.'s $X_t$,
inverse functions of arbitrarily given d.f.'s $F_t$, are independent and
$F_{X_t} = F_t$.
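
The construction in the Remark can be sketched in code. This is an illustration under assumptions, not part of the text: the exponential law and the helper names `exp_df`, `exp_inverse_df` and `coordinates_to_rvs` are hypothetical choices. A point $\omega = (\omega_t)$ of the product of unit intervals is mapped coordinatewise by the inverse functions of the given d.f.'s.

```python
import math

def exp_df(x, rate=1.0):
    """D.f. of the exponential law: F(x) = 1 - exp(-rate * x) for x > 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def exp_inverse_df(u, rate=1.0):
    """Inverse function of exp_df on (0, 1)."""
    return -math.log(1.0 - u) / rate

def coordinates_to_rvs(omega, inverse_dfs):
    """Map omega = (omega_t) in the product of unit intervals to (X_t(omega))."""
    return [inverse(w) for w, inverse in zip(omega, inverse_dfs)]

# Since the Lebesgue measure of {u : F^{-1}(u) <= x} is F(x) for this continuous
# law, composing the d.f. with its inverse must return the starting point u.
roundtrip_errors = [abs(exp_df(exp_inverse_df(u)) - u) for u in (0.1, 0.3, 0.5, 0.9)]

values = coordinates_to_rvs((0.5, 0.5), [exp_inverse_df, lambda u: exp_inverse_df(u, 2.0)])
print(roundtrip_errors, values)
```

Each coordinate uses only its own factor $\Omega_t$, which is why the resulting r.v.'s are independent under the product measure.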

Extension. The preceding considerations apply, word for word, to
random vectors. They also apply to arbitrary random functions,
provided we consider that the d.f. of a random function is defined in terms
of its "finite sections," that is, the family of d.f.'s of projections of the
random function on finite subspaces.

§ 17. CONVERGENCE AND STABILITY OF SUMS; CENTERING AT
EXPECTATIONS AND TRUNCATION

This section and the following one are devoted to the investigation
of sums $S_n = \sum_{k=1}^n X_k$ of independent r.v.'s $X_1, X_2, \ldots$ and, especially,
of their limit properties: convergence to r.v.'s and stability.
Given two numerical sequences $a_n$ and $b_n \uparrow \infty$, we say that the
sequence $S_n$ is stable in pr. or a.s. if $\frac{S_n}{b_n} - a_n \xrightarrow{P} 0$ or
$\frac{S_n}{b_n} - a_n \xrightarrow{a.s.} 0$. In
fact, a stability property is at the root of the whole development of pr.
theory. If $X_1, X_2, \ldots$ are independent and identically distributed
indicators with $P[X_n = 1] = p$ and $P[X_n = 0] = q = 1 - p$, we have
the Bernoulli case. The first stability property is the

Bernoulli law of large numbers: In the Bernoulli case $\frac{S_n}{n} - p \xrightarrow{P} 0$.

The Central Limit Problem, to which the following chapter is devoted,
is the direct descendant of its sharpening by de Moivre and by Laplace.
On the other hand, the following strengthening

Borel strong law of large numbers: In the Bernoulli case $\frac{S_n}{n} - p \xrightarrow{a.s.} 0$,

is at the origin of the results given in this chapter. Perhaps the
importance of the methods overwhelms that of the results and emphasis
will be laid upon the methods. These methods are (1) centering at
expectations and truncation and (2) centering at medians and
symmetrization.

17.1 Centering at expectations and truncation. We say that we
center $X$ at $c$ if we replace $X$ by $X - c$. If $X$ is integrable, then we can
center it at its expectation $EX$ and, thus, $X$ is replaced by $X - EX$.
In other words, a r.v. is centered at its expectation if, and only if, its
expectation exists and equals 0.

Let $X$ be integrable. The second moment of $X - EX$ is called
variance of $X$; it exists but may be infinite and will be denoted by $\sigma^2X$.
Thus

$$\sigma^2X = E(X - EX)^2 = EX^2 - (EX)^2.$$

Since, for every finite $c$, we have

$$\sigma^2(X - c) = E(X - c - E(X - c))^2 = E(X - EX)^2,$$

centerings do not modify variances.

The importance of variances is due to the fact that we have at our
disposal bounds, in terms of variances of summands, of pr.'s of events
defined in terms of sums $S_n$ of independent r.v.'s; we shall find and use
such bounds in this section. However, variances can be introduced
only when the summands are integrable. Moreover, the bounds
mentioned above are nontrivial only when the variances are finite. This
seems to limit the use of such bounds to square-integrable summands.
Yet this obstacle can be overcome by means of the truncation method.

We truncate $X$ at $c > 0$ (finite) when we replace $X$ by $X^c = X$ or 0
according as $|X| < c$ or $|X| \ge c$, and $X^c$ is $X$ truncated at $c$. It
follows that, if $F$ is the d.f. of $X$, then all moments of $X^c$

$$EX^c = \int_{|x| < c} x\,dF, \quad E(X^c)^2 = \int_{|x| < c} x^2\,dF, \quad \text{etc.},$$

exist and are finite. We can always select $c$ sufficiently large so as to
make $P[X \ne X^c] = P[|X| \ge c]$ arbitrarily small. Furthermore, we
can always select the $c_j$ sufficiently large so as to make $P\bigcup_j [X_j \ne X_j^{c_j}]$
arbitrarily small, since, given $\epsilon > 0$, we have

$$P\bigcup_j [X_j \ne X_j^{c_j}] \le \sum_j P[|X_j| \ge c_j] < \epsilon$$

if, say, the $c_j$ are selected so as to make $P[|X_j| \ge c_j] < \epsilon/2^j$. Thus, to
every countable family of r.v.'s we can make correspond a family of
bounded r.v.'s which differs from the first on an event of arbitrarily
small pr. Moreover, if we are interested primarily in limit properties
there is no need for arbitrarily small pr., for the following reasons.
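
A small computation may help fix the definition of truncation. The sketch below rests on a hypothetical finite law `LAW` (a cut-off version of a heavy-tailed law, introduced only for this illustration) and on hypothetical helper names; it computes the law of $X^c$, its first two moments, and $P[X \ne X^c] = P[|X| \ge c]$.

```python
from fractions import Fraction

def truncate_law(law, c):
    """Law of X^c (= X if |X| < c, 0 otherwise) from a finite law {value: probability}."""
    out = {}
    for x, p in law.items():
        y = x if abs(x) < c else 0
        out[y] = out.get(y, Fraction(0)) + p
    return out

def moment(law, order=1):
    """Moment of the given order of a finite law."""
    return sum(Fraction(x) ** order * p for x, p in law.items())

def tail(law, c):
    """P[|X| >= c], which equals P[X != X^c] when X never takes the value 0."""
    return sum(p for x, p in law.items() if abs(x) >= c)

# Hypothetical law: P[X = 2^k] = 2^-k for k = 1..9, remaining mass 2^-9 at 2^10.
LAW = {2 ** k: Fraction(1, 2 ** k) for k in range(1, 10)}
LAW[2 ** 10] = Fraction(1, 2 ** 9)

truncated = truncate_law(LAW, 16)
print(moment(truncated), moment(truncated, 2), tail(LAW, 16))
```

Raising `c` makes the tail probability as small as desired, at the price of larger truncated moments.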

Let two sequences $X_n$ and $X'_n$ of r.v.'s be called tail-equivalent if
they differ a.s. only by a finite number of terms; in other words, if for a.e.
$\omega \in \Omega$ there exists a finite number $n(\omega)$ such that for $n \ge n(\omega)$ the two
sequences $X_n(\omega)$ and $X'_n(\omega)$ are the same; in symbols $P[X_n \ne X'_n \text{ i.o.}] = 0$.
If the sequences $X_n$ and $X'_n$ only converge on the same event
up to some null subset, then we say that they are convergence-equivalent.

Let $S_n = \sum_{k=1}^n X_k$ and $S'_n = \sum_{k=1}^n X'_k$. Since

$$P[X_n \ne X'_n \text{ i.o.}] = \lim_n P\bigcup_{k=n}^\infty [X_k \ne X'_k] \le \lim_n \sum_{k=n}^\infty P[X_k \ne X'_k]$$

it follows that

a. Equivalence lemma. If the series $\sum P[X_n \ne X'_n]$ converges,
then the sequences $X_n$ and $X'_n$ are tail-equivalent and, hence, the series
$\sum X_n$ and $\sum X'_n$ are convergence-equivalent and the sequences $\frac{S_n}{b_n}$ and $\frac{S'_n}{b_n}$,
where $b_n \uparrow \infty$, converge on the same event and to the same limit, excluding a
null event.
17.2 Bounds in terms of variances. To avoid repetitions, we make
the convention that, unless otherwise stated, $S_0 = 0$, $S_n = \sum_{k=1}^n X_k$,
$n = 1, 2, \ldots$, and the summands $X_1, X_2, \ldots$ are independent r.v.'s.

Let $X_1, X_2, \ldots$ be integrable. Since centerings do not modify the
variances, we can assume, when computing variances, that these r.v.'s
are centered at expectations. Then

$$\sigma^2S_n = ES_n^2 = \sum_{k=1}^n EX_k^2 + \sum_{j \ne k} EX_jX_k = \sum_{k=1}^n \sigma^2X_k,$$

since independence of $X_j$ and $X_k$ entails, by 15.2,

$$EX_jX_k = EX_j \cdot EX_k = 0 \quad (j \ne k).$$

Thus, we obtain the classical

Bienaymé equality. If the r.v.'s $X_n$ are independent and integrable,
then

$$\sigma^2S_n = \sum_{k=1}^n \sigma^2X_k.$$

The basic inequalities 9.3A become

a. $$\frac{\sum_{k=1}^n \sigma^2X_k - \epsilon^2}{\text{a.s.} \sup (S_n - ES_n)^2} \le P[|S_n - ES_n| \ge \epsilon] \le \frac{1}{\epsilon^2}\sum_{k=1}^n \sigma^2X_k.$$

The right-hand side inequality is the celebrated Bienaymé-Tchebichev
inequality. Applied to $(S_{n+k} - ES_{n+k}) - (S_n - ES_n)$ and to $S_n - ES_n$
with $\epsilon$ replaced by $\epsilon b_n$, it yields, by passage to the limit,

b. If the series $\sum \sigma^2X_n$ converges, then the series $\sum (X_n - EX_n)$
converges in pr. If $\frac{1}{b_n^2}\sum_{k=1}^n \sigma^2X_k \to 0$, then $\frac{S_n - ES_n}{b_n} \xrightarrow{P} 0$.

This last property is due to Tchebichev (when $b_n = n$). In the
Bernoulli case, where $b_n = n$, $EX_n = p$, $\sigma^2X_n = pq$, it reduces to the
Bernoulli law of large numbers. It is of some interest to observe that
Borel's strengthening can also be obtained by means of the
Bienaymé-Tchebichev inequality (see Introductory part).
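
In the Bernoulli case both sides of the Bienaymé-Tchebichev inequality can be computed exactly. The following sketch is an illustration with assumed parameter values ($p = 1/2$, $\epsilon = 1/10$) and hypothetical function names; it compares $P[|S_n/n - p| \ge \epsilon]$, obtained from the binomial law, with the bound $pq/(n\epsilon^2)$.

```python
from fractions import Fraction
from math import comb

def deviation_probability(n, p, eps):
    """Exact P[|S_n/n - p| >= eps] when S_n is binomial(n, p); p and eps are Fractions."""
    q = 1 - p
    return sum(comb(n, k) * p ** k * q ** (n - k)
               for k in range(n + 1) if abs(Fraction(k, n) - p) >= eps)

def tchebichev_bound(n, p, eps):
    """Bound sigma^2(S_n) / (n * eps)^2 = p * q / (n * eps^2)."""
    return p * (1 - p) / (n * eps ** 2)

P, EPS = Fraction(1, 2), Fraction(1, 10)
exact = {n: deviation_probability(n, P, EPS) for n in (10, 100, 1000)}
bound = {n: tchebichev_bound(n, P, EPS) for n in (10, 100, 1000)}

for n in exact:
    print(n, float(exact[n]), float(bound[n]))
```

The bound tends to 0 like $1/n$, which is all the Bernoulli law of large numbers requires, while the exact probabilities decrease much faster.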

So far, the assumption of independence was used only to establish
that the summands were orthogonal, that is, $EX_jX_k = 0$ $(j \ne k)$ when
$X_j$ and $X_k$ are centered at expectations. In fact, the foregoing results
remain valid under the even less restrictive assumption of orthogonality
of $S_{n-1}$ and $X_n$, $n = 1, 2, \ldots$, since, then,

$$\sigma^2S_n = \sigma^2S_{n-1} + \sigma^2X_n,$$

and the Bienaymé equality follows by induction.

But, in the case of independence, the r.v.'s $S_{n-1}I_{A_{n-1}}$ and $X_n$ are
orthogonal, not only for $A_{n-1} = \Omega$ but also for every event $A_{n-1}$
defined in terms of $X_1, X_2, \ldots, X_{n-1}$. Therefore, it is to be expected
that the foregoing results can be strengthened by using more completely
the properties of independence, in particular the orthogonality
property just mentioned.

A. Kolmogorov inequalities. If the independent r.v.'s $X_k$ are
integrable and the $|X_k| \le c$ finite or not, then, for every $\epsilon > 0$,

$$1 - \frac{(\epsilon + 2c)^2}{\sum_{k=1}^n \sigma^2X_k} \le P\big[\max_{k \le n} |S_k - ES_k| \ge \epsilon\big] \le \frac{1}{\epsilon^2}\sum_{k=1}^n \sigma^2X_k.$$

If one of the variances is infinite, then the right-hand side inequality 
is trivial and the left-hand side inequality has no content (for, then, 
c = 00 ), so that we assume that all variances are finite. In that case, 
the left-hand side inequality is trivial when c is infinite and therefore 
we assume, in proving this inequality, that, moreover, c is finite. 

Proof. We can assume, without restricting the generality, that the
$X_n$ and hence the $S_n$ are centered at expectations, provided we note that
$|X| \le c$ implies $|EX| \le c$ and, hence, $|X - EX| \le 2c$.

Let

$$A_k = \big[\max_{j \le k} |S_j| < \epsilon\big],$$

$$B_k = A_{k-1} - A_k = [|S_1| < \epsilon, \ldots, |S_{k-1}| < \epsilon, |S_k| \ge \epsilon]$$

so that

$$A_0 = \Omega, \quad A_n^c = \sum_{k=1}^n B_k, \quad B_k \subset [|S_{k-1}| < \epsilon, |S_k| \ge \epsilon].$$

1° Since $S_kI_{B_k}$ and $S_n - S_k$ are orthogonal, it follows that

$$\int_{B_k} S_n^2 = E(S_nI_{B_k})^2 = E(S_kI_{B_k})^2 + E((S_n - S_k)I_{B_k})^2 \ge E(S_kI_{B_k})^2 \ge \epsilon^2PB_k.$$

Summing over $k = 1, \ldots, n$, we obtain

$$\sum_{k=1}^n \sigma^2X_k = ES_n^2 \ge \int_{A_n^c} S_n^2 = \sum_{k=1}^n \int_{B_k} S_n^2 \ge \epsilon^2\sum_{k=1}^n PB_k = \epsilon^2PA_n^c,$$

and the right-hand side inequality is proved.

2° Since

$$S_{k-1}I_{A_{k-1}} + X_kI_{A_{k-1}} = S_kI_{A_{k-1}} = S_kI_{A_k} + S_kI_{B_k}$$

and $S_{k-1}I_{A_{k-1}}$ and $X_k$ are orthogonal while $I_{A_k}I_{B_k} = 0$, it follows that

$$E(S_{k-1}I_{A_{k-1}})^2 + \sigma^2X_k \cdot PA_{k-1} = E(S_kI_{A_k})^2 + E(S_kI_{B_k})^2.$$

Since $PA_{k-1} \ge PA_n$ and $|X_k| \le 2c$, and hence

$$|S_kI_{B_k}| \le |S_{k-1}I_{B_k}| + |X_kI_{B_k}| \le (\epsilon + 2c)I_{B_k},$$

it follows that

$$E(S_{k-1}I_{A_{k-1}})^2 + \sigma^2X_k \cdot PA_n \le E(S_kI_{A_k})^2 + (\epsilon + 2c)^2PB_k.$$

Summing over $k = 1, \ldots, n$, we obtain

$$\Big(\sum_{k=1}^n \sigma^2X_k\Big)PA_n \le E(S_nI_{A_n})^2 + (\epsilon + 2c)^2\sum_{k=1}^n PB_k \le \epsilon^2PA_n + (\epsilon + 2c)^2PA_n^c \le (\epsilon + 2c)^2,$$

and the left-hand side inequality follows.
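
For a short walk the probability in Kolmogorov's inequalities can be obtained by enumeration. The sketch below assumes, for illustration only, independent signs $X_k = \pm 1$ with pr. 1/2 each, so that $EX_k = 0$, $\sigma^2X_k = 1$ and $|X_k| \le c = 1$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def max_deviation_probability(n, eps):
    """Exact P[max_{k<=n} |S_k| >= eps] for sums S_k of k independent signs +-1."""
    hits = 0
    for signs in product((-1, 1), repeat=n):
        partial, peak = 0, 0
        for x in signs:
            partial += x
            peak = max(peak, abs(partial))
        hits += peak >= eps
    return Fraction(hits, 2 ** n)

def kolmogorov_bounds(n, eps, c=1):
    """Left and right members of the inequalities when every sigma^2 X_k = 1."""
    variance_sum = n
    lower = 1 - Fraction((eps + 2 * c) ** 2, variance_sum)
    upper = Fraction(variance_sum, eps ** 2)
    return lower, upper

N = 12
results = {eps: (max_deviation_probability(N, eps), kolmogorov_bounds(N, eps))
           for eps in (1, 4)}
for eps, (prob, (lower, upper)) in results.items():
    print(eps, float(lower), float(prob), float(upper))
```

With these values the left-hand bound carries information for small $\epsilon$ and the right-hand bound for large $\epsilon$, in line with the remark that each side is trivial in part of the range.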

17.3 Convergence and stability. We apply now Kolmogorov
inequalities and the truncation method to convergence and stability
problems for consecutive sums $S_n$ of independent r.v.'s $X_1, X_2, \ldots$.

I. Convergence. In this Chapter, convergence means convergence 
to a finite number or to a finite function (r.v.). 

a. If $\sum \sigma^2X_n$ converges, then $\sum (X_n - EX_n)$ converges a.s. If $\sum \sigma^2X_n$
diverges and the $X_n$ are uniformly bounded, then $\sum (X_n - EX_n)$ diverges
a.s. Thus, if the $X_n$ are uniformly bounded, then $\sum (X_n - EX_n)$
converges a.s. if, and only if, $\sum \sigma^2X_n$ converges.

This follows, by letting $m, n \to \infty$ in Kolmogorov's inequalities with
$S_k$ replaced by $S_{m+k} - S_m$.

b. If the $X_n$ are uniformly bounded and $\sum X_n$ converges a.s., then
$\sum \sigma^2X_n$ and $\sum EX_n$ converge.
Proof. To the r.v.'s $X_n$ we associate r.v.'s $X'_n$ such that $X_n$ and $X'_n$
are identically distributed for every $n$ and $X_1, X'_1, X_2, X'_2, \ldots$ is a
sequence of independent r.v.'s. We form the "symmetrized" sequence
$X_n^s = X_n - X'_n$ of independent r.v.'s, and have

$$|X_n^s| \le |X_n| + |X'_n| \le 2c, \quad EX_n^s = EX_n - EX'_n = 0,$$

$$\sigma^2X_n^s = \sigma^2X_n + \sigma^2X'_n = 2\sigma^2X_n.$$

Since $\sum X_n$ converges a.s., so does $\sum X'_n$ and hence $\sum X_n^s$ $(= \sum X_n - \sum X'_n)$.
It follows, by a, that $\sum \sigma^2X_n^s$ and hence $\sum \sigma^2X_n$ converge
and, again by a, $\sum (X_n - EX_n)$ converges a.s., so that $\sum EX_n =
\sum X_n - \sum (X_n - EX_n)$ converges. The assertion is proved.

Let $X^c$ be $X$ truncated at (a finite) $c > 0$. We have Kolmogorov's

A. Three-series criterion. The series $\sum X_n$ of independent
summands converges a.s. to a r.v. if, and only if, for a fixed $c > 0$, the three
series

$$\text{(i)} \ \sum P[|X_n| \ge c], \quad \text{(ii)} \ \sum \sigma^2X_n^c, \quad \text{(iii)} \ \sum EX_n^c$$

converge. 

Proof. Convergence of (i) entails, by the equivalence lemma,
convergence-equivalence of $\sum X_n$ and $\sum X_n^c$, and convergence of (ii) and
(iii) entails, by a, a.s. convergence of $\sum X_n^c$. This proves the "if"
assertion.

Conversely, let $\sum X_n$ converge a.s. so that $X_n \xrightarrow{a.s.} 0$. By 16.3A, Cor. 2,
(i) converges, so that, by the equivalence lemma, $\sum X_n^c$ converges
a.s. and, by b, (ii) and (iii) converge. This proves the "only if"
assertion.

Corollary. If at least one of the three series in A does not converge,
then $\sum X_n$ diverges a.s.

For, by 16.3B (Corollary), $\sum X_n$ either converges a.s. or diverges a.s.
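
The three series of the criterion can be tabulated for concrete summands. The sketch below is an illustration under assumptions: the two families of summands ($X_n = \pm 1/n$ and $X_n = \pm 1/\sqrt{n}$, each sign with pr. 1/2), the truncation level $c = 2$ and all function names are hypothetical choices.

```python
def three_series_partial_sums(law_of_term, c, n_terms):
    """Partial sums of (i) P[|X_n| >= c], (ii) sigma^2 X_n^c, (iii) E X_n^c.

    law_of_term(n) returns the finite law of X_n as {value: probability}.
    """
    tail_sum = var_sum = mean_sum = 0.0
    for n in range(1, n_terms + 1):
        law = law_of_term(n)
        tail_sum += sum(p for x, p in law.items() if abs(x) >= c)
        mean = sum(x * p for x, p in law.items() if abs(x) < c)
        second = sum(x * x * p for x, p in law.items() if abs(x) < c)
        var_sum += second - mean * mean
        mean_sum += mean
    return tail_sum, var_sum, mean_sum

def random_sign_harmonic(n):
    """X_n = +1/n or -1/n with probability 1/2 each."""
    return {1.0 / n: 0.5, -1.0 / n: 0.5}

def random_sign_root(n):
    """X_n = +n^(-1/2) or -n^(-1/2) with probability 1/2 each."""
    return {n ** -0.5: 0.5, -(n ** -0.5): 0.5}

harmonic = three_series_partial_sums(random_sign_harmonic, 2.0, 10_000)
root_small = three_series_partial_sums(random_sign_root, 2.0, 100)
root_large = three_series_partial_sums(random_sign_root, 2.0, 10_000)
print(harmonic, root_small, root_large)
```

For the first family series (ii) stays bounded (it is $\sum 1/n^2$), so the criterion gives a.s. convergence; for the second, series (ii) is the harmonic series, whose partial sums keep growing, so the corollary gives a.s. divergence.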

Remark. In the proof of b we introduced a “symmetrized” sequence. 
This is an application of the “symmetrization method,” to be expounded 
in the next section. 

II. A.s. stability. We seek conditions under which $\frac{S_n}{b_n} - a_n \xrightarrow{a.s.} 0$
when $b_n \uparrow \infty$, and require the following elementary proposition.

Toeplitz lemma. Let $a_{nk}$, $k = 1, 2, \ldots, k_n$, be numbers such that,
for every fixed $k$, $a_{nk} \to 0$ and, for all $n$, $\sum_k |a_{nk}| \le c < \infty$; let
$x'_n = \sum_k a_{nk}x_k$.
Then, $x_n \to 0$ entails $x'_n \to 0$ and, if $\sum_k a_{nk} \to 1$, then $x_n \to x$
finite entails $x'_n \to x$. In particular, if $b_n = \sum_{k=1}^n a_k \uparrow \infty$, then $x_n \to x$
finite entails $\frac{1}{b_n}\sum_{k=1}^n a_kx_k \to x$.

The proof is immediate. If $x_n \to 0$ then, for a given $\epsilon > 0$ and
$n \ge n_\epsilon$ sufficiently large, $|x_n| < \frac{\epsilon}{c}$ so that

$$|x'_n| \le \sum_{k < n_\epsilon} |a_{nk}x_k| + \epsilon.$$

Letting $n \to \infty$ and then $\epsilon \to 0$, it follows that $x'_n \to 0$. The second
assertion follows, since then

$$x'_n = \Big(\sum_k a_{nk}\Big)x + \sum_k a_{nk}(x_k - x) \to x.$$

And setting $a_{nk} = \frac{a_k}{b_n}$, $k \le n$, the particular case is proved.

The particular case yields the powerful 

Kronecker lemma. If $\sum x_n$ converges to $s$ finite and $b_n \uparrow \infty$, then
$\frac{1}{b_n}\sum_{k=1}^n b_kx_k \to 0$.

For, setting $b_0 = 0$, $a_k = b_k - b_{k-1}$, $s_{n+1} = \sum_{k=1}^n x_k$, we have

$$\frac{1}{b_n}\sum_{k=1}^n b_kx_k = \frac{1}{b_n}\sum_{k=1}^n b_k(s_{k+1} - s_k) = s_{n+1} - \frac{1}{b_n}\sum_{k=1}^n a_ks_k \to s - s = 0.$$
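
A quick numerical look at the Kronecker lemma, as a sketch with assumed sequences (the helper `kronecker_average` and the three choices of $x_k$ are hypothetical): with $b_k = k$, the weighted averages tend to 0 for the convergent series $\sum (-1)^{k+1}/k$ and $\sum 1/k^2$, but not for the divergent series $\sum 1/k$.

```python
def kronecker_average(x, b, n):
    """(1/b_n) * sum_{k<=n} b_k x_k, for sequences x and b given as functions of k."""
    return sum(b(k) * x(k) for k in range(1, n + 1)) / b(n)

def identity(k):
    """Weights b_k = k, which increase to infinity."""
    return float(k)

N = 10_001
alternating = kronecker_average(lambda k: (-1.0) ** (k + 1) / k, identity, N)
squares = kronecker_average(lambda k: 1.0 / k ** 2, identity, N)
divergent = kronecker_average(lambda k: 1.0 / k, identity, N)  # hypothesis of the lemma fails
print(alternating, squares, divergent)
```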

We are now in a position to prove Kolmogorov's proposition below.

A. If the integrable r.v.'s $X_n$ are independent, then $\sum \frac{\sigma^2X_n}{b_n^2} < \infty$,
$b_n \uparrow \infty$, entails $\frac{S_n - ES_n}{b_n} \xrightarrow{a.s.} 0$.

For, by Ia, convergence of $\sum \frac{\sigma^2X_n}{b_n^2}$ entails a.s. convergence of
$\sum \frac{X_n - EX_n}{b_n}$, and the Kronecker lemma applies.

We can now prove an extension of Borel's strong law of large numbers.

B. Kolmogorov strong law of large numbers. If the independent
r.v.'s $X_n$ are identically distributed with a common law $\mathcal{L}(X)$, then
$\frac{X_1 + \cdots + X_n}{n} \xrightarrow{a.s.} c$ finite if, and only if, $E|X| < \infty$; and then $c = EX$.

Proof. We set $A_n = [|X| \ge n]$, $A_0 = \Omega$, and observe that, for every
$n$, $PA_n = P[|X_n| \ge n]$, while

$$\sum_{n=1}^\infty PA_n = \sum_{n=1}^\infty (n - 1)(PA_{n-1} - PA_n) \le \sum_{n=1}^\infty E|X|I_{A_{n-1}A_n^c} = E|X| \le \sum_{n=1}^\infty n(PA_{n-1} - PA_n) = 1 + \sum_{n=1}^\infty PA_n.$$

If $\frac{S_n}{n} \xrightarrow{a.s.} c$ finite, then

$$\frac{X_n}{n} = \frac{S_n}{n} - \frac{n-1}{n} \cdot \frac{S_{n-1}}{n-1} \xrightarrow{a.s.} 0$$

and, hence, by 16.3a, Cor. 2, $\sum PA_n < \infty$. This proves the "only if"
assertion and it remains to prove that, if $E|X| < \infty$, then $\frac{S_n}{n} \xrightarrow{a.s.} EX$.

Let $E|X| < \infty$ and set $\bar S_n = \sum_{k=1}^n \bar X_k$, where $\bar X_k$ represents $X_k$
truncated at $k$. Since

$$\sum P[|X_n| \ge n] = \sum PA_n \le E|X| < \infty,$$

it follows that the sequences $S_n/n$ and $\bar S_n/n$ have same limit, and it
suffices to prove that $\frac{\bar S_n}{n} \xrightarrow{a.s.} EX$. Since, by the dominated convergence
theorem,

$$E\bar X_n = EXI_{A_n^c} \to EX$$

and, hence, by the Toeplitz lemma, $\frac{E\bar S_n}{n} \to EX$, it suffices to prove
$\frac{\bar S_n - E\bar S_n}{n} \xrightarrow{a.s.} 0$. But

$$\sum \frac{\sigma^2\bar X_n}{n^2} \le \sum \frac{E\bar X_n^2}{n^2} = E\sum \frac{X^2}{n^2}I_{A_n^c} \le 2 + E|X| < \infty,$$

since, setting $B_m = [m - 1 \le |X| < m]$, we have $A_n^cB_m = \emptyset$ or
$B_m$ according as $n < m$ or $n \ge m$ and, hence,

$$\sum_n \frac{X^2}{n^2}I_{A_n^c}I_{B_m} \le m^2\Big(\frac{1}{m^2} + \frac{1}{(m+1)^2} + \cdots\Big)I_{B_m} \le \Big(1 + m^2\int_m^\infty \frac{dx}{x^2}\Big)I_{B_m} \le (2 + |X|)I_{B_m}$$

so that, by summing over $m$, we obtain the bound $2 + E|X|$. Thus,
theorem A applies, and the proof is complete.
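
Two ingredients of the proof can be illustrated for a specific law. The sketch below assumes, purely for illustration, the exponential law with $EX = 1$, for which $PA_n = P[|X| \ge n] = e^{-n}$; the function names, the seed and the sample size are hypothetical. It checks the comparison $\sum PA_n \le E|X| \le 1 + \sum PA_n$ and then computes one simulated average $S_n/n$.

```python
import math
import random

def tail_sum_exponential(n_terms=200):
    """Sum over n >= 1 of P[|X| >= n] = exp(-n) for the exponential law of mean 1."""
    return sum(math.exp(-n) for n in range(1, n_terms + 1))

def running_average(n, seed=0):
    """S_n / n for n independent exponential summands of mean 1 (seeded simulation)."""
    rng = random.Random(seed)
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

tail_sum = tail_sum_exponential()
average = running_average(200_000)
print(tail_sum, average)
```

A single simulated path proves nothing about a.s. convergence; it only shows the behaviour the theorem leads one to expect.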

*17.4 Generalization. Let $c$, with or without affixes, be finite
positive numbers and let $g_n$ be continuous and nondecreasing functions on
$[0, +\infty]$ such that $g_n(0) = 0$ and $g_n(x) \ge cx^2$ or $\ge c'$ according as
$0 < x < c_n$ or $x \ge c_n$.

a. If the series (i) $\sum P[|X_n| \ge c_n]$ and (ii) $\sum Eg_n(|X_n^{c_n}|)$ converge,
then $\sum (X_n - EX_n^{c_n})$ converges a.s.

For convergence of (i) entails, by the equivalence lemma,
convergence-equivalence of $\sum (X_n - EX_n^{c_n})$ and $\sum (X_n^{c_n} - EX_n^{c_n})$ and, by Ia,
this last series converges a.s., since convergence of (ii) entails

$$\sum \sigma^2X_n^{c_n} \le \sum E|X_n^{c_n}|^2 \le \frac{1}{c}\sum Eg_n(|X_n^{c_n}|) < \infty.$$

b. If the series (i) $\sum Eg_n(|X_n|)$ or (ii) $\sum \int_0^{c_n} P[|X_n| \ge x]\,dg_n(x)$
converges, then $\sum (X_n - EX_n^{c_n})$ converges a.s.

For convergence of (i) entails

$$\sum P[|X_n| \ge c_n] \le \frac{1}{c'}\sum Eg_n(|X_n|) < \infty$$

and

$$\sum Eg_n(|X_n^{c_n}|) \le \sum Eg_n(|X_n|) < \infty,$$

so that a applies.

Similarly, convergence of (ii) entails, by integration by parts,

$$\infty > \sum \int_0^{c_n} P[|X_n| \ge x]\,dg_n(x) = \sum g_n(c_n)P[|X_n| \ge c_n] + \sum \int_0^{c_n} g_n(x)\,dP[|X_n| < x]$$
$$\ge c'\sum P[|X_n| \ge c_n] + \sum Eg_n(|X_n^{c_n}|),$$

so that a applies.
A. If the series (i) $\sum Eg_n\big(\frac{|X_n|}{b_n}\big)$ or (ii) $\sum \int_0^{c_n} P[|X_n| \ge b_nx]\,dg_n(x)$
converges, then

$$\sum \frac{X_n - EX_n^{b_nc_n}}{b_n} \text{ converges a.s. and } \frac{1}{b_n}\sum_{k=1}^n (X_k - EX_k^{b_kc_k}) \xrightarrow{a.s.} 0.$$

Moreover, if (i) converges and $g_n(x) \ge c''x$ for $0 < x \le c_n$ or for $x \ge c_n$,
then $EX_n^{b_nc_n}$ can be replaced by 0 or by $EX_n$, respectively.

Proof. The first assertion follows from b and the Kronecker lemma.
As for the second assertion, if $\sum'$ and $\sum''$ denote summations over
those values of $n$ for which the first, respectively, the second,
assumption about $g_n$ holds, then

$$\sum{}' \frac{1}{b_n}E|X_n^{b_nc_n}| = \sum{}' \int_0^{b_nc_n} \frac{x}{b_n}\,dP[|X_n| < x] \le \frac{1}{c''}\sum{}' \int_0^{b_nc_n} g_n\Big(\frac{x}{b_n}\Big)\,dP[|X_n| < x] < \infty$$

and

$$\sum{}'' \Big|\frac{EX_n}{b_n} - \frac{EX_n^{b_nc_n}}{b_n}\Big| \le \sum{}'' \frac{1}{b_n}E|X_n - X_n^{b_nc_n}| = \sum{}'' \int_{b_nc_n}^\infty \frac{x}{b_n}\,dP[|X_n| < x] \le \frac{1}{c''}\sum{}'' \int_{b_nc_n}^\infty g_n\Big(\frac{x}{b_n}\Big)\,dP[|X_n| < x] < \infty.$$

This completes the proof.

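Particular cases. 1° Let $g_n(x) = |x|^{r_n}$ with $0 < r_n \le 2$. Theorem
A yields

If $b_n \uparrow \infty$ and $\sum \frac{E|X_n|^{r_n}}{b_n^{r_n}} < \infty$, then $\frac{1}{b_n}\sum_{k=1}^n (X_k - a_k) \xrightarrow{a.s.} 0$ where
$a_k = 0$ or $EX_k$ according as $0 < r_k < 1$ or $1 \le r_k \le 2$.

For $r_n = 2$, we find 17.3 IIA.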
2° Let $g_n(x) = x^2$ for $0 \le x \le 1$ and, to simplify the writing, set
$q(x) = P[|X| \ge x]$, $q_n(x) = P[|X_n| \ge x]$. Theorem A yields

If $\int_0^1 x\{\sum q_n(b_nx)\}\,dx < \infty$, then $\sum \frac{1}{b_n}(X_n - EX_n^{b_n})$ converges a.s.
and $\frac{1}{b_n}\sum_{k=1}^n (X_k - EX_k^{b_k}) \xrightarrow{a.s.} 0$.

We require the following

Moments lemma. For every $r > 0$ and $x > 0$

$$x^r\sum_{n=1}^\infty q(n^{1/r}x) \le E|X|^r \le x^r + x^r\sum_{n=1}^\infty q(n^{1/r}x).$$

This follows from

$$E|X|^r = -\int_0^\infty t^r\,dq(t) = -\sum_{n=1}^\infty \int_{(n-1)^{1/r}x}^{n^{1/r}x} t^r\,dq(t)$$

and

$$(n - 1)x^r\{q((n - 1)^{1/r}x) - q(n^{1/r}x)\} \le -\int_{(n-1)^{1/r}x}^{n^{1/r}x} t^r\,dq(t) \le nx^r\{q((n - 1)^{1/r}x) - q(n^{1/r}x)\},$$

by summing the inequalities over $n = 1, 2, \ldots$ and rearranging the
terms.
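
The two bounds of the moments lemma can be evaluated for a law whose moments are known in closed form. The sketch below assumes, for illustration only, the exponential law of mean 1, so that $q(t) = e^{-t}$ and $E|X|^r = \Gamma(r + 1)$; the function name and the grid of values of $r$ and $x$ are hypothetical, and the infinite series is cut off after `n_terms` terms.

```python
import math

def moments_lemma_bounds(q, r, x, n_terms=10_000):
    """Lower and upper bounds x^r * S and x^r * (1 + S), where S = sum_n q(n^(1/r) * x),
    the series being cut off after n_terms terms."""
    series = sum(q(n ** (1.0 / r) * x) for n in range(1, n_terms + 1))
    return x ** r * series, x ** r * (1.0 + series)

def exponential_tail(t):
    """q(t) = P[|X| >= t] for the exponential law of mean 1."""
    return math.exp(-t)

checks = []
for r in (0.5, 1.0, 1.5):
    for x in (0.5, 1.0, 2.0):
        lower, upper = moments_lemma_bounds(exponential_tail, r, x)
        checks.append((lower, math.gamma(r + 1.0), upper))

for lower, moment, upper in checks:
    print(round(lower, 4), round(moment, 4), round(upper, 4))
```

The free parameter $x$ only rescales the grid $n^{1/r}x$ on which the tail $q$ is sampled; the moment is finite exactly when the series converges for one (hence every) $x > 0$.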

3° If $b_n = n^{1/r}$ and the laws of the r.v.'s $X_n$ are uniformly bounded
by the law of a r.v. $X$, that is, $q_n \le q$, then $E|X|^r < \infty$ entails

$$\int_0^1 x\Big(\sum q_n(n^{1/r}x)\Big)\,dx \le \int_0^1 x\Big(\sum q(n^{1/r}x)\Big)\,dx \le E|X|^r\int_0^1 x^{1-r}\,dx$$

so that the right-hand side is finite for $r < 2$. Therefore, on account of
2°,

If $q_n \le q$ and $E|X|^r < \infty$ with $r < 2$, then $\frac{1}{n^{1/r}}\sum_{k=1}^n (X_k - a_k) \xrightarrow{a.s.} 0$
where $a_k = 0$ or $EX_k$ according as $r < 1$ or $\ge 1$.

4° If $F_n = F$, then the converse is also true. More precisely
(Kolmogorov: $r = 1$; Marcinkiewicz: $r \ne 1$),

Let the independent r.v.'s $X_n$ be identically distributed with common
law $\mathcal{L}(X)$, and let $0 < r < 2$.

If $E|X|^r < \infty$, then $\frac{1}{n^{1/r}}\sum_{k=1}^n (X_k - a_k) \xrightarrow{a.s.} 0$ with $a_k = 0$ or $EX$
according as $r < 1$ or $r \ge 1$.

Conversely, if $\frac{1}{n^{1/r}}\sum_{k=1}^n (X_k - a_k) \xrightarrow{a.s.} 0$, then $E|X|^r < \infty$.

Proof. The first assertion is a particular case of the preceding propo¬ 
sition. As for the converse proposition, we use the symmetrization 
method expounded in the following section. 

Let $X'_n$ be a sequence independent of the sequence $X_n$ and with
same distribution, and let $X'$ be independent of $X$ and with same
distribution; set $X_n^s = X_n - X'_n$ and $X^s = X - X'$. Then, on account
of the assumption,

$$Y_n = \frac{1}{n^{1/r}}\sum_{k=1}^{n} X_k^s = \frac{1}{n^{1/r}}\sum_{k=1}^{n}(X_k - a_k) - \frac{1}{n^{1/r}}\sum_{k=1}^{n}(X'_k - a_k) \xrightarrow{a.s.} 0$$

and, hence,

$$\frac{X_n^s}{n^{1/r}} = Y_n - \Big(\frac{n-1}{n}\Big)^{1/r} Y_{n-1} \xrightarrow{a.s.} 0.$$

Since the $X_n^s$ are independent r.v.'s, it follows that, for every $x > 0$,

$$\sum q^s(n^{1/r}x) = \sum P[|X_n^s| \ge n^{1/r}x] < \infty.$$

Therefore, by the moments lemma, $E|X^s|^r < \infty$ so that, by 17.1A,
Corollary 2,

$$E|X - \mu X|^r \le 2E|X^s|^r < \infty$$

and, hence, by the $c_r$-inequality,

$$E|X|^r \le c_r E|X - \mu X|^r + c_r|\mu X|^r < \infty.$$

The proof is complete.


*§ 18. CONVERGENCE AND STABILITY OF SUMS; CENTERING AT
MEDIANS AND SYMMETRIZATION

While centering at expectations goes back to Bernoulli and use of 
bounds in terms of variances goes back to Tchebichev, centering at 
medians and symmetrization are relatively recent. Yet, not only do 
they complete the first ones, but they also tend to replace them altogether.
Moreover, medians always exist and the ch.f.'s of symmetrized
r.v.’s, being real-valued, are much easier to handle than complex-valued 
ones. 

*18.1 Centering at medians and symmetrization. Let $F$ be the d.f.
of a r.v. $X$. There exists at least one finite number $\mu X$ called a median
of $X$, such that

$$P[X \le \mu X] \ge \tfrac{1}{2} \le P[X \ge \mu X]$$

or, equivalently,

$$F(\mu X) \le \tfrac{1}{2} \le F(\mu X + 0).$$

For, $F$ being nondecreasing on $R$ with $F(-\infty) = 0$, $F(+\infty) = 1$, the
graph of $y = F(x)$ completed at its discontinuity points by the segments
$(x, F(x))$ to $(x, F(x + 0))$ has either a point or a segment parallel
to the $x$-axis, in common with the line $y = \tfrac{1}{2}$. According to the
foregoing definition, the abscissae of the common point or of the common
segment are medians of $X$ so that either $X$ has a unique median or it
has for medians all points of a closed interval on $R$: the median
segment of $X$.

It follows from the definition of medians that, for every finite number
$c$, we can set $\mu(cX) = c\mu X$. Furthermore, there is a relation between
$\mu X$, $EX$, and $\sigma^2 X$, namely,

a. If $X$ is integrable, then $|\mu X - EX| \le \sqrt{2\sigma^2 X}$.

For, by Tchebichev's inequality,

$$P[|X - EX| \ge \sqrt{2\sigma^2 X}] \le \tfrac{1}{2}$$

so that

$$EX - \sqrt{2\sigma^2 X} \le \mu X \le EX + \sqrt{2\sigma^2 X}.$$
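As an illustration of the definition of medians and of proposition a, the following sketch computes the median segment of a finite discrete law and checks $|\mu X - EX| \le \sqrt{2\sigma^2 X}$ in squared form. The helper names `median_segment` and `medians_within_bound` are hypothetical, and the restriction to laws with finitely many atoms is an assumption made only to keep the arithmetic exact.

```python
from fractions import Fraction

def median_segment(law):
    """Return (lo, hi), the closed interval of medians of {value: prob}.

    m is a median when P[X <= m] >= 1/2 and P[X >= m] >= 1/2.
    """
    half = Fraction(1, 2)
    xs = sorted(law)
    # smallest atom x with P[X <= x] >= 1/2
    lo = next(x for x in xs if sum(law[y] for y in xs if y <= x) >= half)
    # largest atom x with P[X >= x] >= 1/2
    hi = next(x for x in reversed(xs) if sum(law[y] for y in xs if y >= x) >= half)
    return lo, hi

def medians_within_bound(law):
    """Check (mu - EX)^2 <= 2 * variance at both ends of the median segment."""
    mean = sum(p * x for x, p in law.items())
    var = sum(p * (x - mean) ** 2 for x, p in law.items())
    return all((mu - mean) ** 2 <= 2 * var for mu in median_segment(law))

die = {k: Fraction(1, 6) for k in range(1, 7)}
print(median_segment(die))        # the whole segment [3, 4] consists of medians
print(medians_within_bound(die))
```

A fair die shows the non-unique case: every point of the segment from 3 to 4 satisfies both defining inequalities.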

A r.v. $X$ and its law as well as its d.f. $F$ and ch.f. $f$ are said to be
symmetric if, for every $x$,

(1) $P[X \le x] = P[X \ge -x]$;

equivalently,

(2) $F(-x + 0) = 1 - F(x)$,

or, for every pair $a < b$ of continuity points of $F$,

(3) $F[a, b) = F[-b, -a)$,

or

(4) $f = \bar{f}$ is real.

The symmetrization procedure consists in assigning to a r.v. $X$ the
symmetrized r.v. $X^s = X - X'$, where $X'$ is independent of $X$ and has the
same distribution. More generally, if $X = \{X_t, t \in T\}$ is a family of
r.v.'s, then the symmetrized family is $X^s = \{X_t - X'_t, t \in T\}$ where
the family $X'$ is independent of $X$ and has same distribution. If $X$ has
affixes we affix them to $X^s$ as well as to its d.f. and ch.f. Clearly

b. To a r.v. $X$ with ch.f. $f$, there corresponds a symmetric r.v. $X^s =
X - X'$ where $X$ and $X'$ are independent and identically distributed, and
$f^s = |f|^2$ is the ch.f. of $X^s$.
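Proposition b lends itself to a quick numerical check. The sketch below, which assumes a finite discrete law and uses the hypothetical helper names `chf` and `symmetrized`, builds the law of $X - X'$ by direct convolution and compares its ch.f. with $|f|^2$ at a few arbitrarily chosen points $u$.

```python
import cmath
from itertools import product

def chf(law, u):
    """Characteristic function E exp(iuX) of a finite discrete law {value: prob}."""
    return sum(p * cmath.exp(1j * u * x) for x, p in law.items())

def symmetrized(law):
    """Law of X - X', with X and X' independent and both distributed as `law`."""
    out = {}
    for (x, p), (y, q) in product(law.items(), repeat=2):
        out[x - y] = out.get(x - y, 0.0) + p * q
    return out

law = {-1: 0.5, 0: 0.25, 3: 0.25}   # an asymmetric law
sym = symmetrized(law)
for u in (0.3, 1.0, 2.5):
    fs = chf(sym, u)
    # real part of the symmetrized ch.f., |f(u)|^2, and the (negligible) imaginary part
    print(u, fs.real, abs(chf(law, u)) ** 2, fs.imag)
```

The imaginary part of the symmetrized ch.f. vanishes up to rounding, which is the practical advantage of symmetrization mentioned at the start of the section.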

We arrive now at inequalities which are the basic reason for centering 
at medians. 

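A. Weak symmetrization inequalities. For every $\epsilon$ and every $a$,

(i) $\tfrac{1}{2}P[X - \mu X \ge \epsilon] \le P[X^s \ge \epsilon]$

and

(ii) $\tfrac{1}{2}P[|X - \mu X| \ge \epsilon] \le P[|X^s| \ge \epsilon] \le 2P\big[|X - a| \ge \tfrac{\epsilon}{2}\big]$.

Proof. Since $X^s = X - X'$ where $X$ and $X'$ are independent and
identically distributed, it follows that to a median $\mu = \mu X$ corresponds
an equal median $\mu = \mu X'$ and

$$P[X^s \ge \epsilon] = P[(X - \mu) - (X' - \mu) \ge \epsilon] \ge P[X - \mu \ge \epsilon,\ X' - \mu \le 0]$$
$$= P[X - \mu \ge \epsilon]\cdot P[X' - \mu \le 0] \ge \tfrac{1}{2}P[X - \mu \ge \epsilon].$$

This proves inequality (i) which, together with the inequality obtained
by changing in (i) $X$ into $-X$, entails the left-hand side inequality in
(ii). The right-hand side inequality in (ii) follows from the identical
distribution of $X$ and $X'$ only, by

$$P[|X^s| \ge \epsilon] = P[|(X - a) - (X' - a)| \ge \epsilon] \le P\big[|X - a| \ge \tfrac{\epsilon}{2}\big] + P\big[|X' - a| \ge \tfrac{\epsilon}{2}\big] = 2P\big[|X - a| \ge \tfrac{\epsilon}{2}\big].$$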
Corollary 1. If $X_n - a_n \xrightarrow{P} 0$, then $X_n^s \xrightarrow{P} 0$ and $a_n - \mu X_n \to 0$,
and conversely.

This follows by letting $n \to \infty$ in (ii) where $X$ is replaced by $X_n$.

Corollary 2. For $r > 0$ and every $a$,

$$\tfrac{1}{2}E|X - \mu X|^r \le E|X^s|^r \le 2c_r E|X - a|^r$$

where $c_r = 1$ or $2^{r-1}$ according as $r \le 1$ or $r \ge 1$.

Proof. The right-hand side inequality follows, by the $c_r$-inequality,
from

$$E|X^s|^r = E|(X - a) - (X' - a)|^r \le c_r E|X - a|^r + c_r E|X' - a|^r = 2c_r E|X - a|^r.$$

As for the left-hand side inequality, it is trivial when $E|X^s|^r = \infty$ and
then, according to the inequality just proved (with $a = \mu X$), $E|X -
\mu X|^r = \infty$; thus, we can assume that $E|X^s|^r$ is finite. Let

$$q(t) = P[|X - \mu X| \ge t] \quad\text{and}\quad q^s(t) = P[|X^s| \ge t]$$

so that, by A(ii),

$$q(t) \le 2q^s(t).$$

It follows, upon integrating by parts, that

$$E|X - \mu X|^r = -\int_0^{\infty} t^r\,dq(t) = \int_0^{\infty} q(t)\,d(t^r) \le 2\int_0^{\infty} q^s(t)\,d(t^r) = 2E|X^s|^r,$$

and the proof is concluded.

This corollary was used at the end of the preceding section. 

We pass now to symmetrized families and recall that, if two families
$\{X_t, t \in T\}$ and $\{X'_t, t \in T\}$ are independent, then events defined in
terms of the $X_t$ and in terms of the $X'_t$, respectively, are independent.
We require the following

c. Lemma for events. Let events with subscript 0 be empty. If, for
every integer $j \ge 1$, $A_j A_{j-1}^c \cdots A_0^c$ and $B_j$ are independent, then

$$P\bigcup A_jB_j \ge \alpha P\bigcup A_j, \quad \alpha = \inf PB_j.$$

More generally, if $(A_j + A'_j)(A_{j-1} + A'_{j-1})^c \cdots (A_0 + A'_0)^c$ is
independent of $B_j$ and of $B'_j$, then

$$P\bigcup (A_jB_j + A'_jB'_j) \ge \alpha P\bigcup (A_j + A'_j), \quad \alpha = \inf (PB_j, PB'_j).$$

Proof. The same method applies to both cases. For instance

$$P\bigcup A_jB_j = PA_1B_1 + P(A_1B_1)^c A_2B_2 + P(A_1B_1)^c(A_2B_2)^c A_3B_3 + \cdots$$
$$\ge PA_1B_1 + PA_1^c A_2B_2 + PA_1^c A_2^c A_3B_3 + \cdots$$
$$= PA_1\cdot PB_1 + PA_1^c A_2\cdot PB_2 + PA_1^c A_2^c A_3\cdot PB_3 + \cdots$$
$$\ge \alpha(PA_1 + PA_1^c A_2 + PA_1^c A_2^c A_3 + \cdots) = \alpha P\bigcup A_j.$$

B. Symmetrization inequalities. For every $\epsilon$ and every $a_j$, $j \le n$,

(i) $\tfrac{1}{2}P[\sup_j (X_j - \mu X_j) \ge \epsilon] \le P[\sup_j X_j^s \ge \epsilon]$

and

(ii) $\tfrac{1}{2}P[\sup_j |X_j - \mu X_j| \ge \epsilon] \le P[\sup_j |X_j^s| \ge \epsilon] \le 2P\big[\sup_j |X_j - a_j| \ge \tfrac{\epsilon}{2}\big]$.

Proof. Since $X_j^s = X_j - X'_j$ and the families $\{X_j\}$ and $\{X'_j\}$ are
independent and identically distributed, it follows that to medians
$\mu_j = \mu X_j$ correspond equal medians $\mu_j = \mu X'_j$; setting

$$A_j = [X_j - \mu_j \ge \epsilon'], \quad B_j = [X'_j - \mu_j \le 0], \quad C_j = [X_j^s \ge \epsilon'],$$

so that $A_jB_j \subset C_j$, the lemma for events applies, with $\alpha = \tfrac{1}{2}$, and
yields $\tfrac{1}{2}P\bigcup A_j \le P\bigcup A_jB_j \le P\bigcup C_j$.
This proves (i) by letting $\epsilon' \uparrow \epsilon$, and (ii) follows by arguments similar
to those used in the proof of A and by the lemma for events.

Corollary. If $X_n - a_n \xrightarrow{a.s.} 0$, then $X_n^s \xrightarrow{a.s.} 0$ and $a_n - \mu X_n \to 0$;
and conversely.

By centering sums of independent r.v.’s at suitable medians, we ob¬ 
tain inequalities which can play the role of Kolmogorov’s inequalities. 

C. P. Lévy inequalities. If $X_1, \cdots, X_n$ are independent r.v.'s and
$S_k = \sum_{j=1}^{k} X_j$, then, for every $\epsilon$,

(i) $P[\max_{k \le n} (S_k - \mu(S_k - S_n)) \ge \epsilon] \le 2P[S_n \ge \epsilon]$

and

(ii) $P[\max_{k \le n} |S_k - \mu(S_k - S_n)| \ge \epsilon] \le 2P[|S_n| \ge \epsilon]$.

Proof. Let $S_0 = 0$, $S_k^* = \max_{j \le k} (S_j - \mu(S_j - S_n))$ and set

$$A_k = [S_{k-1}^* < \epsilon,\ S_k - \mu(S_k - S_n) \ge \epsilon],$$
$$B_k = [S_n - S_k - \mu(S_n - S_k) \ge 0]$$

where $\mu(S_n - S_k) = -\mu(S_k - S_n)$. Since

$$[S_n^* \ge \epsilon] = \sum_{k=1}^{n} A_k, \quad [S_n \ge \epsilon] \supset \sum_{k=1}^{n} A_kB_k, \quad PB_k \ge \tfrac{1}{2},$$

(i) follows upon applying the lemma for events or, directly, by

$$P[S_n \ge \epsilon] \ge \sum_{k=1}^{n} PA_kPB_k \ge \tfrac{1}{2}\sum_{k=1}^{n} PA_k = \tfrac{1}{2}P[S_n^* \ge \epsilon].$$

By changing the signs of all r.v.'s which figure in (i) and combining with
(i), inequality (ii) follows, and the proof is complete.
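For symmetric summands the medians of the differences $S_k - S_n$ can be taken equal to 0, and inequality (i) reduces to $P[\max_{k \le n} S_k \ge \epsilon] \le 2P[S_n \ge \epsilon]$. The sketch below checks this special case exactly by enumerating all paths of $n$ fair $\pm 1$ steps; the choice of summands and the function name `levy_sides` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

def levy_sides(n, eps):
    """Exact (P[max_{k<=n} S_k >= eps], 2 P[S_n >= eps]) for n fair +-1 steps."""
    hits_max = hits_end = 0
    for steps in product((-1, 1), repeat=n):
        partial = 0
        best = -n  # lower bound for every partial sum S_1, ..., S_n
        for x in steps:
            partial += x
            best = max(best, partial)
        hits_max += best >= eps
        hits_end += partial >= eps
    total = 2 ** n
    return Fraction(hits_max, total), 2 * Fraction(hits_end, total)

for n in (4, 8, 12):
    for eps in (1, 2, 3):
        lhs, rhs = levy_sides(n, eps)
        print(n, eps, lhs, rhs, lhs <= rhs)
```

The enumeration costs $2^n$ paths, so it is meant for small $n$ only; it illustrates the inequality and is not a substitute for the proof.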

Remark. Let $X_1, \cdots, X_n$ be independent, square-integrable, and
centered at expectations. Since, by a,

$$|\mu(S_k - S_n)| \le \sqrt{2\sigma^2(S_n - S_k)} \le \sqrt{2\sigma^2 S_n},$$

inequality (i) remains valid if $\mu(S_k - S_n)$ is replaced by $\sqrt{2\sigma^2 S_n}$ and,
hence, changing $\epsilon$ into $\epsilon - \sqrt{2\sigma^2 S_n}$,

$$P[\max_{k \le n} S_k \ge \epsilon] \le 2P[S_n \ge \epsilon - \sqrt{2\sigma^2 S_n}].$$

*18.2 Convergence and stability. We are now in possession of the
basic tools and shall apply them to the investigation of convergence and
stability of sums $S_n = \sum_{k=1}^{n} X_k$ of independent r.v.'s. We recall that
here we say that a sequence of r.v.'s converges a.s. if it converges
a.s. to a r.v., and their sequence of laws converges if it converges to the
law of a r.v., that is, converges completely.

I. Convergence. Whatever be the sequence of r.v.'s, we have the
comparison table of convergences below:

convergence a.s. ⇒ convergence in pr. ⇒ convergence of laws
                          ⇑
                  convergence in q.m.

("in q.m." means "in the 2nd mean" and reads "in quadratic mean").

For series of independent r.v.'s, reverse implications are also true
(either with no restriction or under a uniform boundedness restriction).
More precisely

a. Improved convergence lemma. For series of independent r.v.'s,

(i) Convergence a.s. and convergence in pr. are equivalent.

(ii) If the summands are uniformly bounded and centered at expectations,
then convergence a.s., convergence in pr., convergence in q.m., and
convergence of laws, are equivalent.

Proof. 1° Let $S_n \xrightarrow{P} S$, so that, by 6.3A, there exists a subsequence
$S_{n_k} \xrightarrow{a.s.} S$ with $\sum P[|S_{n_{k+1}} - S_{n_k}| \ge 2^{-k}] < \infty$. Let $n_k < n \le n_{k+1}$
and set $T_k = \max_n |S_n - S_{n_k} - \mu(S_n - S_{n_{k+1}})|$, so that, by P. Lévy's
inequality (ii),

$$\sum_k P[T_k \ge 2^{-k}] \le 2\sum_k P[|S_{n_{k+1}} - S_{n_k}| \ge 2^{-k}] < \infty$$

and, hence, $T_k \xrightarrow{a.s.} 0$ as $k \to \infty$. Therefore,

$$|S_n - S - \mu(S_n - S_{n_{k+1}})| \le |S_n - S_{n_k} - \mu(S_n - S_{n_{k+1}})| + |S_{n_k} - S| \le T_k + |S_{n_k} - S| \xrightarrow{a.s.} 0,$$

that is, $S_n - \mu(S_n - S_{n_{k+1}}) \xrightarrow{a.s.} S$ and, a fortiori, $S_n - \mu(S_n - S_{n_{k+1}}) \xrightarrow{P} S$.
Since $S_n \xrightarrow{P} S$, it follows that $\mu(S_n - S_{n_{k+1}}) \to 0$ and, hence,
$S_n \xrightarrow{a.s.} S$. Thus, convergence in pr. of the series $\sum X_n$ entails its
convergence a.s. and, the converse being always true, the first assertion is
proved.

2° Let $|X_n| \le c < \infty$ and $EX_n = 0$. The series $\sum X_n$ converges
in q.m. if, and only if, as $m, n \to \infty$

$$E(S_m - S_n)^2 = \sum_{k=m+1}^{n} \sigma^2 X_k \to 0$$

or, equivalently, $\sum \sigma^2 X_n < \infty$; then it converges in pr. and, hence, by
the first assertion, it converges a.s. But if $\mathcal{L}(S_n) \to \mathcal{L}(S)$, so that for
all $u$ in some neighborhood of the origin

$$-\sum \log |f_n| = -\log |f_S| < \infty,$$

then, by 12.4B', for $u$ belonging to the intersection of this neighborhood
with $(-1/c, +1/c)$,

$$2\sum \sigma^2 X_n = \sum \sigma^2 X_n^s \le -\frac{3}{u^2}\sum \log |f_n(u)|^2 < \infty$$

and the second assertion follows.

The three-series criterion follows from this improved convergence 
lemma exactly as it followed from the convergence lemma in section 16. 

Remark. A better insight into the behavior of the series is provided
by the Liapounov theorem for the bounded case, according to which,
if $s_n^2 = \sum_{k=1}^{n} \sigma^2 X_k \to \infty$ and $ES_n = 0$, then, for any fixed $a > 0$ and
$\epsilon > 0$ and $n$ large enough to have $\epsilon s_n > a$, we have

(1) $P[|S_n| \ge a] \ge P[|S_n| \ge \epsilon s_n] \to \frac{2}{\sqrt{2\pi}}\int_{\epsilon}^{\infty} e^{-x^2/2}\,dx.$

Thus, as $\epsilon \to 0$, $P[|S_n| \ge a] \to 1$ for any fixed but arbitrarily large
$a$, and the sequence $\mathcal{L}(S_n)$ of laws diverges to a law degenerate at
infinity. The second assertion follows ab contrario, and we see that when
the sequence of laws does not converge, then, as $n \to \infty$, the distribution
of $S_n$ escapes to infinity in the fashion described by (1).

So far we have been concerned with convergence of a given series.
Yet various auxiliary centering constants appeared during the investigation,
and the problem arises whether, given the series $\sum X_n$ of independent
r.v.'s, there exist centering constants $a_n$ such that the series
$\sum (X_n - a_n)$ converges. If $\sum (X_n - a_n)$ converges a.s. for some numerical
constants $a_n$, we say that the series $\sum X_n$ is essentially convergent;
otherwise, we say that it is essentially divergent, since, then, by
the corollary of the zero-one law, $\sum (X_n - a_n)$ diverges a.s. whatever
be the $a_n$. As above, our problem is to find criteria for this dichotomy
and to find the suitable centering constants when the series is essentially
convergent; at the same time, we shall be able to improve the preceding
results (see also 37.1).

b. Essential convergence lemma. The series $\sum X_n$ is essentially
convergent if, and only if, the symmetrized series $\sum X_n^s$ converges a.s.

Proof. If $\sum X_n^s$ converges a.s., then, for every finite $c > 0$, using
17.1A, by the three series criterion,

$$\sum P[|X_n - \mu X_n| \ge c] \le 2\sum P[|X_n^s| \ge c] < \infty$$

and, upon integrating by parts,

$$\sum \sigma^2(X_n - \mu X_n)^c \le 2\sum \sigma^2(X_n^s)^c + 2c^2\sum P[|X_n^s| \ge c] < \infty.$$

Therefore, the series $\sum [X_n - \mu X_n - E(X_n - \mu X_n)^c]$ converges a.s.
and the "if" assertion is proved while the "only if" assertion is
immediate.

From this proof follows the

A. Two-series criterion. The series $\sum X_n$ is essentially convergent
if, and only if, for some arbitrarily fixed $c > 0$, the two series $\sum P[|X_n -
\mu X_n| \ge c]$ and $\sum \sigma^2(X_n - \mu X_n)^c$ converge; then the centered series
$\sum \{X_n - \mu X_n - E(X_n - \mu X_n)^c\}$ converges a.s.

The essential convergence lemma permits us to improve further the 
convergence lemma. 

B. Equivalence theorem. For series of independent r.v.'s, convergence
of laws, convergence in pr. and a.s. convergence are equivalent.

Proof. It suffices to prove that convergence of laws implies a.s.
convergence. Let $f_n$ be the ch.f. of $X_n$ so that $|f_n|^2$ is ch.f. of $X_n^s$. If
$\prod_{k=1}^{n} f_k \to f$ ch.f., then $\prod_{k=1}^{n} |f_k|^2 \to |f|^2$ and, by 13.4, 5°, the two series
$\sum P[|X_n^s| \ge c]$ and $\sum \sigma^2(X_n^s)^c$ converge. Since $E(X_n^s)^c = 0$, it
follows, by the three series criterion, that the symmetrized series $\sum X_n^s$
converges a.s. Therefore, by the essential convergence lemma, there
exist constants $a_n$ such that the series $\sum (X_n - a_n)$ converges a.s. to
a r.v. and a fortiori its law converges completely, so that, for every $u$,
$\prod_{k=1}^{n} [f_k(u)e^{-iua_k}] \to f'(u)$, where $f'$ is a ch.f. By taking $u$ close enough to
0 so that $f(u)f'(u) \ne 0$, it follows that the series $\sum a_n$ converges and,
hence, the series $\sum X_n$ converges a.s. This completes the proof.

Corollary 1. A series $\sum X_n$ of independent r.v.'s converges a.s. if,
and only if, $\prod_{k=1}^{n} f_k \to f$ and $f$ is continuous at the origin or $f \ne 0$ on a set
of positive Lebesgue measure.

This follows by the continuity theorem or 12.4, 4°.

Corollary 2. A series $\sum X_n$ of independent r.v.'s is essentially
convergent or divergent according as

$$\lim \prod_{k=1}^{n} |f_k| \ne 0 \text{ on a set of positive Lebesgue measure}$$

or

$$\lim \prod_{k=1}^{n} |f_k| = 0 \text{ a.e.}$$

This follows by 13.4, 4° and b.

II. Stability. Given sequences $a_n$ and $b_n \uparrow \infty$, we seek conditions
for a.s. stability of sequences $S_n$ of sums of independent r.v.'s. On
account of the corollary to the symmetrization lemma, a first condition is
that $a_n = \mu\big(\frac{S_n}{b_n}\big) + o(1)$. Thus, it suffices to take $a_n = \mu\big(\frac{S_n}{b_n}\big)$ and
investigate conditions under which $\frac{S_n}{b_n} - \mu\big(\frac{S_n}{b_n}\big) \xrightarrow{a.s.} 0$.

We have $b_n \uparrow \infty$ and, moreover, assume that there exists a subsequence
$b_{n_k}$ and finite numbers $c$, $c'$ such that, for all $k$ sufficiently large,
$1 < c' \le \frac{b_{n_{k+1}}}{b_{n_k}} \le c < \infty$. Roughly speaking, this assumption means that the
sequence $b_n$ does not increase too fast, and it is always satisfied (with an
arbitrary $c > 1$) when $\frac{b_{n+1}}{b_n} \to 1$. Let $n_0 = 0$ and $T_k = \frac{S_{n_k} - S_{n_{k-1}}}{b_{n_k}}$.

A. A.S. STABILITY CRITERION. (i) $\frac{S_n}{b_n} - \mu\big(\frac{S_n}{b_n}\big) \xrightarrow{a.s.} 0$ if, and only if,
(ii) $T_k - \mu T_k \xrightarrow{a.s.} 0$ as $k \to \infty$ or, equivalently, (ii') for every $\epsilon > 0$,

$$\sum P[|T_k - \mu T_k| \ge \epsilon] < \infty.$$

Proof. Since the $T_k$ are nonoverlapping sums of independent r.v.'s,
it follows, by 16.3A, that conditions (ii) and (ii') are equivalent. And, on
account of the symmetrization lemma, it suffices to prove equivalence
of (i) and (ii) for symmetric summands; then the medians which figure
in these conditions vanish.

If $\frac{S_n}{b_n} \xrightarrow{a.s.} 0$, then $\frac{S_{n_k}}{b_{n_k}} \xrightarrow{a.s.} 0$ as $k \to \infty$,

$$T_k = \frac{S_{n_k} - S_{n_{k-1}}}{b_{n_k}} = \frac{S_{n_k}}{b_{n_k}} - \frac{b_{n_{k-1}}}{b_{n_k}}\cdot\frac{S_{n_{k-1}}}{b_{n_{k-1}}} \xrightarrow{a.s.} 0$$

and the "only if" assertion is proved.

Conversely, if $T_k \xrightarrow{a.s.} 0$, then, by the Toeplitz lemma,

$$\frac{S_{n_k}}{b_{n_k}} = \frac{1}{b_{n_k}}\sum_{j=1}^{k} b_{n_j}T_j \xrightarrow{a.s.} 0.$$

Furthermore, upon setting $U_k = \max_{n_{k-1} < n \le n_k} \frac{|S_n - S_{n_{k-1}}|}{b_{n_k}}$ and applying
P. Lévy's inequality we obtain, for every $\epsilon > 0$,

$$\sum P[U_k \ge \epsilon] \le 2\sum P[|T_k| \ge \epsilon] < \infty,$$

so that $U_k \xrightarrow{a.s.} 0$. Therefore, for $n_{k-1} < n \le n_k$,

$$\left|\frac{S_n}{b_n}\right| \le \frac{b_{n_k}}{b_{n_{k-1}}}U_k + \frac{|S_{n_{k-1}}|}{b_{n_{k-1}}} \le cU_k + \frac{|S_{n_{k-1}}|}{b_{n_{k-1}}} \xrightarrow{a.s.} 0,$$

and the "if" assertion is proved.

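Corollary 1. If $|X_n| < b_n$, then $\frac{S_n - ES_n}{b_n} \xrightarrow{a.s.} 0$ if, and only if,
$T_k - ET_k \xrightarrow{a.s.} 0$ as $k \to \infty$ or, equivalently, for every $\epsilon > 0$,

$$\sum_k P[|T_k - ET_k| \ge \epsilon] < \infty.$$

Proof. The "only if" assertion is proved as that of the foregoing
criterion. As for the "if" assertion, set $X_{nk} = (X_n - EX_n)/b_{n_k}$, $n_{k-1} <
n \le n_k$, so that $\sum_n X_{nk} = T_k - ET_k \xrightarrow{a.s.} 0$. Note that $|X_{nk}| < 2$ and
apply 13.5, 3° and 18.1a. It follows that

$$\sigma^2 T_k \to 0, \quad |\mu T_k - ET_k| \le \sqrt{2\sigma^2 T_k} \to 0,$$

so that $T_k - \mu T_k \xrightarrow{a.s.} 0$ and, by the foregoing criterion, $\frac{S_n}{b_n} - \mu\big(\frac{S_n}{b_n}\big) \xrightarrow{a.s.} 0$. But

$$\left|\mu\Big(\frac{S_n}{b_n}\Big) - \frac{ES_n}{b_n}\right| \le \sqrt{2}\,\frac{\sigma_n}{b_n} \to 0, \quad (\sigma_n^2 = \sigma^2 S_n),$$

since, for $n_{k-1} < n \le n_k$,

$$\frac{\sigma_n^2}{b_n^2} \le \frac{c^2}{b_{n_k}^2}\sum_{j=1}^{k} b_{n_j}^2\sigma^2 T_j \to 0.$$

Therefore, $\frac{S_n - ES_n}{b_n} \xrightarrow{a.s.} 0$, and the proof is concluded.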

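Corollary 2. If the $X_n$ are centered at expectations and $\sum \frac{\sigma^2 X_n}{b_n^2} < \infty$,
then $\frac{S_n}{b_n} \xrightarrow{a.s.} 0$.

Let $\bar{X}_n$ be $X_n$ truncated at $b_n$, and set $\bar{S}_n = \sum_{k=1}^{n} \bar{X}_k$, $\bar{T}_k = \frac{\bar{S}_{n_k} - \bar{S}_{n_{k-1}}}{b_{n_k}}$.
Since, by Tchebichev's inequality,

$$\sum P[\bar{X}_n \ne X_n] = \sum P[|X_n| \ge b_n] \le \sum \frac{\sigma^2 X_n}{b_n^2} < \infty,$$

it follows, by the equivalence lemma, that the sequences $\frac{\bar{S}_n}{b_n}$ and $\frac{S_n}{b_n}$ are
tail-equivalent. But $E\bar{S}_n/b_n \to 0$ since $|E\bar{X}_k - EX_k| \le \sigma^2 X_k/b_k$ while

$$\sum_k P[|\bar{T}_k - E\bar{T}_k| \ge \epsilon] \le \frac{1}{\epsilon^2}\sum_k \sigma^2\bar{T}_k \le \frac{1}{\epsilon^2}\sum_k \sum_{n_{k-1} < n \le n_k} \frac{\sigma^2 X_n}{b_{n_k}^2} \le \frac{1}{\epsilon^2}\sum_n \frac{\sigma^2 X_n}{b_n^2} < \infty$$

so that Corollary 1 applies, $\frac{\bar{S}_n - E\bar{S}_n}{b_n} \xrightarrow{a.s.} 0$ and, therefore, $\frac{S_n}{b_n} \xrightarrow{a.s.} 0$.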


*§ 19. EXPONENTIAL BOUNDS AND NORMED SUMS

In this section, the r.v.'s $X_n$, $n = 1, 2, \ldots$, are independent and
centered at expectations with variance $\sigma_n^2 = \sigma^2 X_n = EX_n^2$; and
$S_n = \sum_{k=1}^{n} X_k$ are their consecutive sums, so that $ES_n = 0$,
$s_n^2 = \sigma^2 S_n = \sum_{k=1}^{n} \sigma_k^2$. We exclude the trivial case of degenerate summands.

19.1 Exponential bounds. Kolmogorov's inequalities led, in Section
17, to asymptotic properties of sums $S_n$. His inequalities below, where
to simplify the writing we drop the subscript $n$, will lead to deeper
results but under more restrictive assumptions.

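A. Exponential bounds. Let $c = \max_{k \le n} \left|\frac{X_k}{s}\right|$ and let $\epsilon > 0$.

(i) If $\epsilon c \le 1$, then $P\big[\frac{S}{s} > \epsilon\big] < \exp\big[-\frac{\epsilon^2}{2}\big(1 - \frac{\epsilon c}{2}\big)\big]$ and, if $\epsilon c \ge 1$,
then $P\big[\frac{S}{s} > \epsilon\big] < \exp\big[-\frac{\epsilon}{4c}\big]$.

(ii) Given $\gamma > 0$, if $c = c(\gamma)$ is sufficiently small and $\epsilon = \epsilon(\gamma)$ is
sufficiently large, then $P\big[\frac{S}{s} > \epsilon\big] > \exp\big[-\frac{\epsilon^2}{2}(1 + \gamma)\big]$.

Proof. 1° Let $t > 0$, $|X| \le c < \infty$, $EX = 0$ and $\sigma^2 = \sigma^2 X$. Since

$$|EX^n| \le \sigma^2 c^{n-2}\ (n \ge 2), \quad Ee^{tX} = 1 + \frac{t^2}{2!}EX^2 + \frac{t^3}{3!}EX^3 + \cdots,$$

and $e^{t(1-t)} < 1 + t < e^t$, it follows that, for $tc \le 1$,

$$Ee^{tX} < 1 + \frac{t^2\sigma^2}{2}\Big(1 + \frac{tc}{3} + \frac{t^2c^2}{3\cdot 4} + \cdots\Big) < 1 + \frac{t^2\sigma^2}{2}\Big(1 + \frac{tc}{2}\Big) < \exp\Big[\frac{t^2\sigma^2}{2}\Big(1 + \frac{tc}{2}\Big)\Big]$$

and

$$Ee^{tX} > 1 + \frac{t^2\sigma^2}{2}\Big(1 - \frac{tc}{3} - \frac{t^2c^2}{3\cdot 4} - \cdots\Big) > 1 + \frac{t^2\sigma^2}{2}\Big(1 - \frac{tc}{2}\Big) > \exp\Big[\frac{t^2\sigma^2}{2}(1 - tc)\Big].$$

Replacing $X$ by $\frac{X_k}{s}$, setting $S' = \frac{S}{s}$, and taking into account that

$$Ee^{tS'} = \prod_k E\exp\Big[\frac{tX_k}{s}\Big],$$

we obtain

(1) $\exp\Big[\frac{t^2}{2}(1 - tc)\Big] < Ee^{tS'} < \exp\Big[\frac{t^2}{2}\Big(1 + \frac{tc}{2}\Big)\Big].$

Inequalities (i) follow then from

$$P[S' > \epsilon] \le e^{-t\epsilon}Ee^{tS'} < \exp\Big[-t\epsilon + \frac{t^2}{2}\Big(1 + \frac{tc}{2}\Big)\Big]$$

where $t$ is replaced by $\epsilon$ or $\frac{1}{c}$ according as $\epsilon c \le 1$ or $\ge 1$.

2° The proof of inequality (ii) is much more involved. Let $\alpha$ and
$\beta$ be two positive numbers less than 1; they will be selected later in
terms of the given number $\gamma$. According to (1), we can take $c$
sufficiently small ($< \alpha/t$) so as to have

(2) $Ee^{tS'} > \exp\Big[\frac{t^2}{2}(1 - \alpha)\Big].$

On the other hand, setting $q(x) = P[S' > x]$ and integrating by parts,
we have

$$Ee^{tS'} = -\int e^{tx}\,dq(x) = t\int e^{tx}q(x)\,dx.$$

We decompose the interval $(-\infty, +\infty)$ of integration into the five
intervals $I_1 = (-\infty, 0]$, $I_2 = (0, t(1 - \beta)]$, $I_3 = (t(1 - \beta), t(1 + \beta)]$,
$I_4 = (t(1 + \beta), 8t]$ and $I_5 = (8t, +\infty)$ and search for upper bounds of
the integral over $I_1$ and $I_5$ and over $I_2$ and $I_4$. We have

$$J_1 = t\int_{-\infty}^{0} e^{tx}q(x)\,dx < t\int_{-\infty}^{0} e^{tx}\,dx = 1.$$

On account of (i), we have on $I_5$, for $8tc < 1$,

$$q(x) < \exp\Big[-\frac{x}{4c}\Big] < \exp[-2tx] \quad\text{for } x \ge \frac{1}{c},$$

$$q(x) < \exp\Big[-\frac{x^2}{2}\Big(1 - \frac{xc}{2}\Big)\Big] \le \exp\Big[-\frac{x^2}{4}\Big] < \exp[-2tx] \quad\text{for } x < \frac{1}{c}.$$

Therefore, for $c$ sufficiently small ($< 1/8t$)

$$J_5 = t\int_{8t}^{\infty} e^{tx}q(x)\,dx < t\int_{8t}^{\infty} e^{-tx}\,dx < 1$$

and

(3) $J_1 + J_5 < 2.$

On the intervals $I_2$ and $I_4$ we have $x < \frac{1}{c}$ for $c$ sufficiently small and,
by (i),

$$e^{tx}q(x) < \exp\Big[tx - \frac{x^2}{2}\Big(1 - \frac{xc}{2}\Big)\Big] \le \exp\Big[tx - \frac{x^2}{2}(1 - 4tc)\Big] = e^{g(x)}.$$

The quadratic expression $g(x)$ attains its maximum for $x = \frac{t}{1 - 4tc}$
which, for $c < \beta/4t(1 + \beta)$, lies in $I_3$. Therefore, for $c$ sufficiently small
and $x \in I_2$,

$$g(x) \le g(t(1 - \beta)) = \frac{t^2}{2}(1 - \beta)(1 + \beta + 4tc - 4tc\beta) < \frac{t^2}{2}\Big(1 - \frac{\beta^2}{2}\Big)$$

and, then,

$$J_2 = t\int_0^{t(1-\beta)} e^{tx}q(x)\,dx < t\int_0^{t(1-\beta)} e^{g(x)}\,dx < t^2\exp\Big[\frac{t^2}{2}\Big(1 - \frac{\beta^2}{2}\Big)\Big];$$

similarly,

$$J_4 = t\int_{t(1+\beta)}^{8t} e^{tx}q(x)\,dx < t\int_{t(1+\beta)}^{8t} e^{g(x)}\,dx < 8t^2\exp\Big[\frac{t^2}{2}\Big(1 - \frac{\beta^2}{2}\Big)\Big].$$

We set now $\alpha = \frac{\beta^2}{4}$ and $t = \frac{\epsilon}{1 - \beta}$ so that, by (2),

(4) $J_2 + J_4 < 9t^2\exp\Big[\frac{t^2}{2}\Big(1 - \frac{\beta^2}{2}\Big)\Big] < \frac{9\epsilon^2}{(1 - \beta)^2}\exp\Big[-\frac{\epsilon^2\beta^2}{8(1 - \beta)^2}\Big]Ee^{tS'}.$

Since the last expectation and the inverse of its coefficient increase
indefinitely as $\epsilon \to \infty$, it follows, by (3) and (4), that for $\epsilon$ sufficiently large

$$J_1 + J_5 < 2 < \tfrac{1}{4}Ee^{tS'}, \quad J_2 + J_4 < \tfrac{1}{4}Ee^{tS'}.$$

Then

$$t\int_{t(1-\beta)}^{t(1+\beta)} e^{tx}q(x)\,dx > \tfrac{1}{2}Ee^{tS'},$$

a fortiori,

$$2t^2\beta e^{t^2(1+\beta)}q(\epsilon) > \tfrac{1}{2}\exp\Big[\frac{t^2}{2}(1 - \alpha)\Big]$$

and, since as $\epsilon \to \infty$, $\frac{1}{4t^2\beta}\exp\big[\frac{t^2}{2}\alpha\big] \to \infty$, replacing $t$ by its value, it
follows that, for $\epsilon$ sufficiently large,

$$q(\epsilon) > \frac{1}{4t^2\beta}\exp\Big[\frac{t^2}{2}\alpha\Big]\exp\Big[-\frac{t^2}{2}(1 + 2\alpha + 2\beta)\Big] > \exp\Big[-\frac{\epsilon^2}{2}\cdot\frac{1 + 2\alpha + 2\beta}{(1 - \beta)^2}\Big].$$

But, given $\gamma > 0$, we can select $\beta > 0$ so as to have

$$\frac{1 + 2\beta + \frac{\beta^2}{2}}{(1 - \beta)^2} \le 1 + \gamma.$$

Therefore, for $c = c(\gamma)$ sufficiently small and $\epsilon = \epsilon(\gamma)$ sufficiently large,

$$q(\epsilon) > \exp\Big[-\frac{\epsilon^2}{2}(1 + \gamma)\Big]$$

and (ii) is proved.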
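The upper exponential bounds (i) can be compared with an exact tail in the simplest bounded case. The sketch below assumes $n$ independent fair $\pm 1$ summands, so that $s = \sqrt{n}$ and $c = 1/\sqrt{n}$; the function names `rademacher_tail` and `upper_bound` are hypothetical, and floating-point arithmetic is used for the exponentials.

```python
import math

def rademacher_tail(n, eps):
    """Exact P[S_n / sqrt(n) > eps] for n independent fair +-1 summands."""
    # S_n = 2H - n, where H is the number of +1 steps, H ~ Binomial(n, 1/2)
    threshold = eps * math.sqrt(n)
    count = sum(math.comb(n, h) for h in range(n + 1) if 2 * h - n > threshold)
    return count / 2 ** n

def upper_bound(n, eps):
    """Right-hand side of (i) with s = sqrt(n) and c = max|X_k| / s = 1/sqrt(n)."""
    c = 1 / math.sqrt(n)
    if eps * c <= 1:
        return math.exp(-(eps ** 2 / 2) * (1 - eps * c / 2))
    return math.exp(-eps / (4 * c))

for n in (10, 100, 400):
    for eps in (0.5, 1.0, 2.0, 3.0):
        print(n, eps, rademacher_tail(n, eps), upper_bound(n, eps))
```

As $n$ grows with $\epsilon$ fixed, $\epsilon c \to 0$ and the bound approaches $\exp[-\epsilon^2/2]$, the order of magnitude that the lower bound (ii) shows cannot be improved in the exponent.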
*19.2 Stability. The a.s. stability criterion (which is due to Prokhorov
for $b_n = n$) is a criterion in the sense that it is both necessary
and sufficient. Yet, it is not satisfactory, since, because of the
independence of the summands, it has to be expected that a satisfactory criterion
ought to be expressed in terms of individual summands and not in terms
of nonoverlapping sums. The nearest to this requirement is a criterion
in terms of variances (due also to Prokhorov for $b_n = n$), valid when
the summands are suitably bounded, and whose proof is based upon the
exponential bounds.

Let $b_n \uparrow \infty$, $1 < c' \le \frac{b_{n_{k+1}}}{b_{n_k}} \le c < \infty$ and set $T_k = \frac{S_{n_k} - S_{n_{k-1}}}{b_{n_k}}$,
$t_k^2 = \sigma^2 T_k = \frac{1}{b_{n_k}^2}\sum_{n_{k-1} < n \le n_k} \sigma^2 X_n$. We write $\log_2$ for $\log\log$.

A. If $\frac{|X_n|}{b_n} = o(\log_2^{-1} b_n)$ then $\frac{S_n}{b_n} \xrightarrow{a.s.} 0$ if, and only if, for every
$\epsilon > 0$, the series (i) $\sum \exp\Big[-\frac{\epsilon^2}{t_k^2}\Big]$ converges.


Proof. For $n$ sufficiently large $\frac{|X_n|}{b_n} < 1$, so that corollary 1 of the
a.s. stability criterion applies: $\frac{S_n}{b_n} \xrightarrow{a.s.} 0$ if, and only if, for every $\epsilon > 0$

(ii) $\sum P[|T_k| > \epsilon] < \infty.$

We have to prove that convergence of series (i) for some $\epsilon$ implies that
of series (ii) for the same or distinct $\epsilon$; and conversely. On the other
hand, elementary computations show that, setting $c_k = \max_{n_{k-1} < n \le n_k} \frac{|X_n|}{b_{n_k}}$,
the assumption made implies that $c_k = \frac{a_k}{\log k}$ with $a_k \to 0$ as $k \to \infty$.

We use now the upper exponential bounds and observe that for
$\frac{\epsilon c_k}{t_k^2} \ge 1$ and $k$ sufficiently large

$$P[|T_k| > \epsilon] < 2\exp\Big[-\frac{\epsilon}{4c_k}\Big] = 2\exp\Big[-\frac{\epsilon}{4a_k}\log k\Big] = 2k^{-\epsilon/4a_k} < \frac{2}{k^2}$$

and

$$\exp\Big[-\frac{\epsilon^2}{t_k^2}\Big] \le \exp\Big[-\frac{\epsilon}{c_k}\Big] = \exp\Big[-\frac{\epsilon}{a_k}\log k\Big] < \frac{1}{k^2}$$

so that the corresponding sums in (i) and (ii) converge and we can
neglect all those terms for which $\frac{\epsilon c_k}{t_k^2} \ge 1$.

Since for $\frac{\epsilon c_k}{t_k^2} < 1$

$$P[|T_k| > \epsilon] < 2\exp\Big[-\frac{\epsilon^2}{2t_k^2}\Big(1 - \frac{\epsilon c_k}{2t_k^2}\Big)\Big] < 2\exp\Big[-\frac{\epsilon^2}{4t_k^2}\Big],$$

it follows that convergence of series (i) for every $\epsilon > 0$ entails that of
series (ii) for every $\epsilon > 0$. Conversely, if series (ii) converges, then
$T_k \xrightarrow{a.s.} 0$ and $t_k^2 \to 0$, so that, for $k$ sufficiently large, $\frac{\epsilon}{t_k}$ is as large as we
please and $\frac{c_k}{t_k} < \frac{t_k}{\epsilon}$ is as small as we please. Therefore, the exponential
bound is valid with, say, $\gamma = 1$, and

$$P[|T_k| > \epsilon] > 2\exp\Big[-\frac{\epsilon^2}{t_k^2}\Big],$$

so that convergence of series (ii) for every $\epsilon > 0$ entails that of series
(i) for every $\epsilon > 0$, and the proof is concluded.


Corollary. If for an $r \ge 1$, $\sum \frac{E|X_n|^{2r}}{n^{r+1}} < \infty$, then $\frac{S_n}{n} \xrightarrow{a.s.} 0$.

For $r = 1$, this proposition coincides with Corollary 2 of the a.s.
stability criterion, so that it suffices to consider the case $r > 1$ (due to Brunk).

Proof. Let $\bar{X}_n = X_n$ or 0 according as $|X_n| < n^{\frac{r+1}{2r}}$ or $\ge n^{\frac{r+1}{2r}}$, so
that

$$\frac{|\bar{X}_n|}{n} = o(\log_2^{-1} n), \quad \sum \frac{E|\bar{X}_n|^{2r}}{n^{r+1}} \le \sum \frac{E|X_n|^{2r}}{n^{r+1}} < \infty$$

and, by Tchebichev's inequality,

$$\sum P[\bar{X}_n \ne X_n] = \sum P\big[|X_n| \ge n^{\frac{r+1}{2r}}\big] \le \sum \frac{E|X_n|^{2r}}{n^{r+1}} < \infty.$$

Therefore, on account of the equivalence lemma, it suffices to prove
that the assertion holds for r.v.'s $X_n$ which satisfy the assumption made
in A.

But, upon applying for $r > 1$ the inequality $E^r|X| \le E|X|^r$, setting
$n_k = 2^k$, and applying the $c_r$-inequality with $n_k - n_{k-1}$ summands, we
have, summing over $n = n_{k-1} + 1, \cdots, n_k$,

$$t_k^{2r} = \Big(\frac{1}{n_k^2}\sum \sigma_n^2\Big)^r \le \frac{(n_k - n_{k-1})^{r-1}}{n_k^{2r}}\sum E|X_n|^{2r} \le \sum \frac{E|X_n|^{2r}}{n^{r+1}}.$$

Therefore,

$$\sum_k t_k^{2r} \le \sum_n \frac{E|X_n|^{2r}}{n^{r+1}} < \infty$$

and, since we have $\exp\Big[-\frac{\epsilon^2}{t_k^2}\Big] < t_k^{2r}$ for $k$ sufficiently large, criterion
A is satisfied, and the proof is concluded.

*19.3 Law of the iterated logarithm. We say that a numerical
sequence $b_n$ belongs to the upper class or to the lower class of a sequence
$S_n$ of r.v.'s, according as $P[S_n > b_n \text{ i.o.}] = 0$ or 1. A priori, there may
be sequences $b_n$ which belong to neither of these two classes. However,
if $S_n$ is an essentially divergent sequence of consecutive sums of
independent r.v.'s, then every sequence $b_n$ belongs to one of the foregoing
two classes. The problem which arises is that of corresponding criteria.
Relatively little is known about its general solution (in the case of
unbounded summands), and the proofs of what is known are quite
involved; the best results are due to Feller. The basic known result was
first obtained by Khintchine (also P. Lévy) in the Bernoulli case as a
strengthening of consecutive improvements of Borel's strong law of
large numbers and, then, was extended by Kolmogoroff (also Cantelli)
to more general cases, as follows:

A. Law of the iterated logarithm. If

$$s_n^2 \to \infty \quad\text{and}\quad \frac{|X_n|}{s_n} = o(\log_2^{-1/2} s_n^2), \quad t_n = (2\log_2 s_n^2)^{1/2},$$

then

$$P\Big[\limsup \frac{S_n}{s_nt_n} = 1\Big] = 1.$$

In other words, for every $\delta > 0$, the sequence $(1 + \delta)s_nt_n$ belongs to
the upper class of the sequence $S_n$ while the sequence $(1 - \delta)s_nt_n$
belongs to the lower class; clearly, it suffices to prove these assertions for
$\delta$ arbitrarily small.

We observe that, since the assumptions remain valid if every $X_n$ is
replaced by $-X_n$, the conclusion yields

$$P\Big[\liminf \frac{S_n}{s_nt_n} = -1\Big] = 1$$

and, therefore, it holds for both sequences $S_n$ and $|S_n|$ if it holds for
the first one.

Proof. Since $s_n^2 \to \infty$ and $\frac{s_{n+1}^2}{s_n^2} = 1 + o(\log_2^{-1} s_n^2) \to 1$, it
follows that, for every $c > 1$, there exists a sequence $n_k = n_k(c) \uparrow \infty$ as
$k \to \infty$, such that $s_{n_k} \sim c^k$. Let $\delta$, $\delta'$, $\delta''$ be positive numbers.

1° We prove that the sequences $(1 + \delta)s_nt_n$ belong to the upper
class of the sequence $S_n$ by proving the same for the sequence $S_k^* =
\max_{n \le n_k} S_n$. For

$$P[S_n > (1 + \delta)s_nt_n \text{ i.o.}] \le P[S_k^* > (1 + \delta)s_{n_{k-1}}t_{n_{k-1}} \text{ i.o.}]$$

where

$$(1 + \delta)s_{n_{k-1}}t_{n_{k-1}} \sim \frac{1 + \delta}{c}s_{n_k}t_{n_k};$$

hence, taking $\delta' < \delta$, we can select $c > 1$ so that $\frac{1 + \delta}{c} > 1 + \delta'$ and

$$P[S_k^* > (1 + \delta)s_{n_{k-1}}t_{n_{k-1}} \text{ i.o.}] \le P[S_k^* > (1 + \delta')s_{n_k}t_{n_k} \text{ i.o.}].$$

Thus, the assertion will follow from the Cantelli lemma if we prove that

$$\sum P[S_k^* > (1 + \delta')s_{n_k}t_{n_k}] < \infty.$$

But, by the remark at the end of 18.1, the general term of this series
is bounded by $2P\Big[S_{n_k} > \Big(1 + \delta' - \frac{\sqrt{2}}{t_{n_k}}\Big)s_{n_k}t_{n_k}\Big]$ where $1 + \delta' - \frac{\sqrt{2}}{t_{n_k}} \to
1 + \delta'$. Therefore, for $\delta'' < \delta'$ and $k$ sufficiently large,

$$P\Big[S_{n_k} > \Big(1 + \delta' - \frac{\sqrt{2}}{t_{n_k}}\Big)s_{n_k}t_{n_k}\Big] \le P[S_{n_k} > (1 + \delta'')s_{n_k}t_{n_k}],$$

and it suffices to prove that the right-hand side is general term of a
convergent series. This follows by applying the first upper exponential
bound with $\epsilon_k = (1 + \delta'')t_{n_k}$ and $c_k = \max_{j \le n_k} |X_j|/s_{n_k}$, valid for $k$
sufficiently large since $c_kt_{n_k} \to 0$, so that

$$P[S_{n_k} > (1 + \delta'')s_{n_k}t_{n_k}] \le \exp\Big[-\tfrac{1}{2}\Big(1 - \frac{\epsilon_kc_k}{2}\Big)(1 + \delta'')^2t_{n_k}^2\Big] \le \exp[-(1 + \delta'')\log_2 s_{n_k}^2] \sim \frac{1}{(2k\log c)^{1 + \delta''}},$$

and the assertion is proved. Furthermore, according to the considerations
which follow the statement of the theorem, this assertion entails
that $P[|S_n| > (1 + \delta)s_nt_n \text{ i.o.}] = 0$.

2° It remains to prove that the sequences $(1 - \delta')s_nt_n$ belong to the
lower class of the sequence $S_n$ where we will take $1 > \delta' > \delta$. This
assertion will be a fortiori true if we prove that it holds for a sequence $S_{n_k}$.
Let

$$u_k^2 = s_{n_k}^2 - s_{n_{k-1}}^2 \sim s_{n_k}^2\Big(1 - \frac{1}{c^2}\Big), \quad v_k = (2\log_2 u_k^2)^{1/2} \sim (2\log_2 s_{n_k}^2)^{1/2} = t_{n_k}$$

and set

$$A_k = [S_{n_k} - S_{n_{k-1}} > (1 - \delta)u_kv_k].$$

We prove first that $P[A_k \text{ i.o.}] = 1$, as follows: The sums $S_{n_k} - S_{n_{k-1}}$,
being nonoverlapping sums of independent r.v.'s, are independent and,
by the Borel criterion, it suffices to prove that $\sum PA_k = \infty$. But,
$\epsilon_k = (1 - \delta)v_k \to \infty$ while $c_k = \max_{n_{k-1} < n \le n_k} (|X_n|/u_k) \to 0$ as $k \to \infty$; hence
the lower exponential bound for $PA_k$ applies with $1 + \gamma = \frac{1}{1 - \delta}$.
Therefore,

$$PA_k > \exp[-\tfrac{1}{2}(1 + \gamma)(1 - \delta)^2v_k^2] = \exp[-(1 - \delta)\log_2 u_k^2] \sim \frac{1}{(2k\log c)^{1 - \delta}},$$

the series $\sum PA_k$ diverges, and $P[A_k \text{ i.o.}] = 1$.

On the other hand, if $B_k = [|S_{n_{k-1}}| \le 2s_{n_{k-1}}t_{n_{k-1}}]$, then, according
to the end of 1°, $P[B_k^c \text{ i.o.}] = 0$; thus, from some value $n = n(\omega)$ on,
$|S_{n_{k-1}}(\omega)| \le 2s_{n_{k-1}}t_{n_{k-1}}$ except for $\omega$ belonging to the null event $[B_k^c \text{ i.o.}]$.
Therefore, $P[A_kB_k \text{ i.o.}] = 1$, and this entails the assertion. For,

$$A_kB_k \subset [S_{n_k} > (1 - \delta)u_kv_k - 2s_{n_{k-1}}t_{n_{k-1}}]$$

where

$$(1 - \delta)u_kv_k - 2s_{n_{k-1}}t_{n_{k-1}} \sim \Big\{(1 - \delta)\Big(1 - \frac{1}{c^2}\Big)^{1/2} - \frac{2}{c}\Big\}s_{n_k}t_{n_k}$$

and, if we take $c$ sufficiently large so that for $\delta' > \delta$

$$(1 - \delta)\Big(1 - \frac{1}{c^2}\Big)^{1/2} - \frac{2}{c} > 1 - \delta',$$

then

$$1 = P[A_kB_k \text{ i.o.}] \le P[S_{n_k} > (1 - \delta')s_{n_k}t_{n_k} \text{ i.o.}].$$

The proof is terminated.


COMPLEMENTS AND DETAILS 

n 

As throughout this chapter, S n = ^ Xk and a.s. convergence is to a r.v. 

i 

/. If the chi of a sum of two r.v.’s is the product of the ch.f/s of the sum¬ 
mands, the summands may not be independent. Construct examples. Here 
is one: -XT is a Cauchy r.v. — with ch.f. consider X + Y where Y = cX 、 

c>0. 

2. Let X y Y be independent r.v.’s and let r ^ 1. 

If X and Y are centered at expectations, then E\X + Y\ r majorizes E\ X | r 
and E\ Y\ r . More generally, if, say, A is an event defined on X f then 

E\X+Y\ r I A ^E\X\ r I A . * 

IfE[ X+Y\ r is finite, so areand £| Y\ r . (Since \x\ r = \ E(x + Y) | r 
彡 E\x + Y | r , it follows that 

E\X+Y\ r I A =jdF x {x) [j\x+ y ydF Y {y)\ \ r dF x {x) = E\X\ r I A . 

For r > 1 the first assertion implies the second one. For r = 1, set A = 
[| X I < a] and observe that E\ X + Y\^ E(] Y j — o)Ia — (E\Y \ — a)PA) 

3. Generalized Kolmogorov inequality. Let $X_1, X_2, \dots$ be independent r.v.'s centered at expectations, and let $r \ge 1$. Set $C = [\sup_{k \le n} |S_k| \ge c]$ and prove that

$$c^r PC \le E|S_n|^r I_C \le E|S_n|^r.$$

Apply to the same problems to which Kolmogorov's inequality was applied. For example, if $S_n \xrightarrow{r} S$, then $S_n \xrightarrow{a.s.} S$. (Set $C_k = [\sup_{j < k} |S_j| < c, \ |S_k| \ge c]$, $S_0 = 0$. By 2, $E|S_n|^r I_C = \sum_k E|S_n|^r I_{C_k} \ge \sum_k E|S_k|^r I_{C_k} \ge c^r PC$.)
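The inequality in 3 can be verified exactly on a small case. The following sketch is an illustration only; the choice of a symmetric $\pm 1$ walk, $n = 10$, $r = 3$, $c = 3$, and the function name are assumptions made here. It enumerates all $2^n$ paths and returns the three quantities $c^r PC$, $E|S_n|^r I_C$ and $E|S_n|^r$.

```python
from itertools import product

def kolmogorov_terms(n=10, r=3.0, c=3.0):
    """Exact c^r P(C), E|S_n|^r I_C and E|S_n|^r for a symmetric +-1 walk."""
    p_c = 0.0      # P(C), where C = [max_{k<=n} |S_k| >= c]
    e_on_c = 0.0   # E |S_n|^r I_C
    e_all = 0.0    # E |S_n|^r
    weight = 0.5 ** n          # pr. of each path
    for steps in product((-1, 1), repeat=n):
        s, peak = 0, 0
        for x in steps:
            s += x
            peak = max(peak, abs(s))
        moment = abs(s) ** r
        e_all += weight * moment
        if peak >= c:
            p_c += weight
            e_on_c += weight * moment
    return c ** r * p_c, e_on_c, e_all

low, mid, high = kolmogorov_terms()
print(low, mid, high)  # expected to be in nondecreasing order
```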

4. Let $X_1, X_2, \dots$ be independent r.v.'s, and let $T_n^r = \sup_{k \le n} |S_k|^r$, $r \ge 1$.

If the $X_k$ are symmetric, then $ET_n^r \le 2E|S_n|^r$.

If the $X_k$ are centered at expectations, then $ET_n^r \le 2^{2r+1} E|S_n|^r$.

Extend to $n = \infty$ when $S_n \xrightarrow{a.s.} S$. (If symmetric, then

$$ET_n^r = \int_0^\infty P[T_n^r \ge t] \, dt \le 2\int_0^\infty P[|S_n|^r \ge t] \, dt = 2E|S_n|^r.$$

If centered at expectations symmetrize; then

$$|S_k|^r \le 2^{r-1}|S_k - S'_k|^r + 2^{r-1}|S'_k|^r.$$

Integrate over $X'_1, \dots, X'_n$, take sup, integrate over $X_1, \dots, X_n$, and apply the first assertion.)

5. Let $X_1, X_2, \dots$ be independent r.v.'s centered at expectations, and let $r \ge 1$. If $\sum E|X_n|^{2r}/n^{r+1} < \infty$, then $S_n/n \xrightarrow{a.s.} 0$. (Apply 4 and the elementary inequality $(\sum_{k=1}^n a_k^2)^r \le n^{r-1} \sum_{k=1}^n |a_k|^{2r}$ to obtain $E|S_n|^{2r} \le c n^{r-1} \sum_{k=1}^n E|X_k|^{2r}$. By Tchebichev's inequality,

$$P[|S_{2^{k+1}} - S_{2^k}| \ge \epsilon 2^k] \le c 2^{r+1} \epsilon^{-2r} \sum_{j=2^k+1}^{2^{k+1}} E|X_j|^{2r}/j^{r+1}.$$

Apply the a.s. stability criterion with $n_k = 2^k$.)

6. The series $\sum c_n e^{i\theta_n}$ where the $\theta_n$ are independent r.v.'s with $Ee^{i\theta_n} = 0$, converges or diverges a.s., according as the series $\sum c_n^2$ converges or diverges.

7. If a series $\sum X_n$ of independent r.v.'s converges a.s., then by centering the summands at the terms of some convergent series, the a.s. convergence and the limit are preserved under all changes of the order of the summands. (Start with a series which converges in q.m. Use the centering in the two series criterion.)

8. A series $\sum X_n$ of independent r.v.'s with ch.f.'s $f_n$ converges a.s. whatever be the order of summands if, and only if, $\sum |f_n - 1| < \infty$.

9. If a series $\sum X_n$ of independent r.v.'s is essentially divergent, then it degenerates at infinity: $P[|S_n| < c] \to 0$ however large be $c > 0$. State the dual form for essential convergence. (This is true for the symmetrized series. Prove and apply: if $X$ and $X'$ are independent and identically distributed, then $P^2[|X| < c] \le P[|X - X'| < 2c]$.)

10. Let $\sum X_n$ be a series of independent r.v.'s with ch.f.'s $f_n$.

If for a subsequence of integers $m \to \infty$ there exist r.v.'s $Y_m$ with ch.f. $g_m$ such that $S_m$ and $Y_m - S_m$ are independent and $|g_m|^2 \to |g|^2$ continuous at the origin, then $\sum X_n$ is essentially convergent. (This follows from $\prod_{k=1}^m |f_k| \ge |g_m| \to |g| > \epsilon > 0$ in a neighborhood of the origin.)

11. Smoothing by addition. Loosely speaking, a sum of independent r.v.’s 
is at least as “smooth” as any of its summands. More precisely, continuity or 
analyticity properties of the law of one of the summands continue to hold for 
the law of the sum. Examples: 

(a) If one of the summands has a continuous law so does the sum. (Introduce the "concentration" $C_X$ defined by $C_X(l) = \max_{x \in R} P[x \le X \le x + l]$, $l \ge 0$. Observe that $C_X(0) = 0$ if, and only if, $F_X$ is continuous. By the composition theorem for independent r.v.'s $X$ and $Y$, $C_{X+Y} \le C_X$, $C_{X+Y} \le C_Y$.)

(b) If one of the summands has an absolutely continuous law, so does the sum. (In defining the concentration replace translates of segments of length $l$ by translates of Lebesgue sets of measure $l$.)

(c) If one of the summands has a strictly increasing d.f., so does the sum. 
What about unicity of medians? 
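The inequality $C_{X+Y} \le C_X$, $C_{X+Y} \le C_Y$ used in 11(a) can be observed on small discrete laws. The sketch below is a hedged illustration: the two integer-valued laws and the helper names are invented here, and for integer-valued laws and integer $l$ the maximum is taken over integer $x$ only, which suffices in that case.

```python
from fractions import Fraction

def convolve(px, py):
    """Law of X + Y for independent integer-valued X, Y given as {value: pr.}."""
    out = {}
    for x, a in px.items():
        for y, b in py.items():
            out[x + y] = out.get(x + y, 0) + a * b
    return out

def concentration(p, l):
    """C(l) = max_x P[x <= X <= x + l] for an integer-valued law, integer l."""
    lo, hi = min(p), max(p)
    return max(sum(q for v, q in p.items() if x <= v <= x + l)
               for x in range(lo - l, hi + 1))

px = {0: Fraction(1, 2), 1: Fraction(1, 4), 5: Fraction(1, 4)}
py = {-1: Fraction(1, 3), 0: Fraction(1, 3), 2: Fraction(1, 3)}
ps = convolve(px, py)
for l in range(4):
    # The last column should never exceed either of the two preceding ones.
    print(l, concentration(px, l), concentration(py, l), concentration(ps, l))
```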

12. The symmetrization method reduces medians to zero and transforms essentially convergent series into a.s. convergent ones. However, only centering at medians does not yield a.s. convergence. In fact, let $\sum_{n=0}^{\infty} X_n$ be an a.s. convergent series of independent summands. The sequence $\mu(S_n)$ of medians may not converge. However, if $\sum_{0}^{\infty} X_n$ is essentially convergent and the r.v. $Y$ is independent of all the $X_n$ and has a strictly increasing d.f., then, after centering the $S_n + Y$ at medians, the series converges a.s.

(For the counterexample, take Xo = — 1 or +1 with same pr. 1/2; let 
0 < /> n < 1 with 53/> n < 00 and, for w ^ 1, take and with values 

2( —l) n of pr. p n and 0 of pr. 1 — p n . The sequence S n converges a.s., yet the 
S n are odd integers with ^ 1 and fjL(Si n ^ ^ 1. For the last assertion 

use 11(c).) 

13. The $X_n$ are not assumed to be independent. If $S_{n^2}/n^2 \xrightarrow{a.s.} U$ and the $X_n$ are uniformly bounded, then $S_n/n \xrightarrow{a.s.} U$. What if $n^2$ is replaced by $n^k$ where $k$ is a fixed integer? What if $n^2$ is replaced by $[q^n]$ with $q > 1$ arbitrarily close to 1?


More generally, let ^ P[\ U n — U \ > €]/w* < «> for every € > 0, 
X) P[\ I > cn fi \ < 00 for some c > 0 } 0 < a ^ 1, 0 > 0. If 7 ^ 


then U n U y where U n = S n /n y 

(For the first assertion, the second part of the proof of Borel's strong law of
large numbers (see Introductory Part) applies. For the second assertion, use 
the following property of series: if 13 I |/” a < 00 0 < a ^ 1, then 

JL\pn k \ < 00 (or 7t k+1 - n k = o(n k ^)). 

k 


In what follows, the r.v.'s $X_1, X_2, \dots$ are independent and identically distributed with common d.f. $F$ and ch.f. $f$ of a r.v. $X$; the trivial case of $X = 0$ a.s. is excluded. In other words, repeated trials are performed on $X$.

14. Random selection. Let $\nu_1 < \nu_2 < \cdots$ be integer-valued r.v.'s such that every $[\nu_j = n]$ is defined on $X_1, \dots, X_{n-1}$. The r.v.'s $X_{\nu_1}, X_{\nu_2}, \dots$ are independent and identically distributed, as $X$. (Proceed as in

$$P[X_{\nu_1} < x_1, X_{\nu_2} < x_2] = \sum P[\nu_1 = n_1, X_{n_1} < x_1; \nu_2 = n_2, X_{n_2} < x_2]$$

$$= \sum_{1 \le n_1 < n_2 < \infty} P[\nu_1 = n_1, X_{n_1} < x_1, \nu_2 = n_2] \, P[X_{n_2} < x_2]$$

$$= P[X_1 < x_1] P[X_2 < x_2].)$$

15. Deviations from the median. If X is centered at a median, then

气 i: X k \> g ^j ： E\X k l g{2n + 1) = g{2n + 2) = 
k^\ n kaml {2 n n } y 

This inequality is not necessarily true when X is centered at its expectation. 
Extend to nonidentically distributed XkS. (Divide R n into its 2 n “octants” 
and consider the corresponding parts of the left-hand side. For a counter¬ 
example, take n = X = l with pr. 2/3 and —2 with pr. 1/3.) 

16. Equidistribution of sums. If X is a lattice r.v. with step h — only possible 

• 1 + n 

values kh y 是 = 0, 土 1 ， •• — set M(g) = lim -—— —X) and otherwise 

n ♦ qo *r* 1 糸期 一 ii 


. 1 r +A • 

set M{g) = lim — I g(x) dx for those functions g on R for which either of the 
h In J 一 h 

foregoing limits exists and is finite. 


(a) In the first case M{e iux ) = 1 or 0, according as « = 0 (mod f) or “〆 0 

’ 27T\ 

mod —J . In the second case M(e iuz ) = 1 or 0 according as « = C or « 〆 0. 


(b) For every u d 

1 n & s 

y n = - z ^ ^ M{e iux ). 

n fc-i 

(This is immediate in the lattice case and \( u = 0. Otherwise f(u) 9 ^ 1 and 


E \ Y ^' = \ + 

n n 心 j>k n 

where c is finite. Use 13.) 

• . 1 ^ u.s. 

(c) The family G of functions g on R such that - X i(^k) — > M(g) con- 

tains all almost periodic functions and functions with period p Riemanri- 
integrable on [o y p]. {G contains all functions g(x) = e iux . It is closed under 
additions, multiplications by complex numbers, conjugations, and uniform pas¬ 
sages to the limit. M is a linear monotone operation on G.) 

If gn^lG and g n ^ g y then M(g n ) —> M{g). If g ff n C G and M(g f n ) - 
M(g f, n) —> 0, then for every g such that g f n ^ g S g f， n whatever be n y g G 
and M(g) = 15m M(g f n ) = lim M(g ,f n ). 

(d) For X degenerate at an irrational a y the classical equidistribution (modulo 
1) of the fractional parts of na follows: for g bounded with g(x) — c finite as 
x — 土 00 , 

1 

- Z g(^k) —> c. 
w k^i 

For every finite segment /, (no. of <S*i, • • •, in I)/n —> 0. 

17. Normal r.v.'s. Let $X$ be normal with $EX = 0$, $EX^2 = 1$, let $g$ on $R^n$ be a finite Borel function, and set $\bar X = S_n/n$.

(a) If $g(x_1 + c, \dots, x_n + c) = g(x_1, \dots, x_n)$ for all $x_k, c \in R$, then the ch.f. of the pair $\bar X$, $g(X_1, \dots, X_n)$ is $f(u, v) = f_1(u) f_2(u, v)$ where $f_1(u) = e^{-u^2/2n}$ is ch.f. of $\bar X$ and $f_2(u, v) = (2\pi)^{-n/2} \int h(x_1, \dots, x_n) \, dx_1 \cdots dx_n$ with

$$\log h(x_1, \dots, x_n) = -\frac{1}{2} \sum_{k=1}^{n} \Bigl(x_k - \frac{iu}{n}\Bigr)^2 + iv \, g(x_1, \dots, x_n).$$

(b) If $f_2$ is analytic in $u$, then $\bar X$ and $g(X_1, \dots, X_n)$ are independent. In particular, $\bar X$ is independent of $\max_{j,k} |X_j - X_k|$ and of $\sum_{k=1}^{n} |X_k - \bar X|^r$, $r > 0$.

($f_2$ is independent of $u$: set $u = inc$ and use the translation property of $g$.)

(c) Let p with or without affixes denote a pr. density with respect to the 

Lebesgue measure* Let p(x) = - y= exp [—(x — a) 2 /2P] be the pr. density 

by/lir 

of Xk 、 and set 

炉 = 丄 X) (JG — 叉 ) 2 , ^ = S/y/n y Y — — a), Z = - 


Then the pr. density of Y is 


y/lir 


厂 xV 2 , 如 pr. density of Z converges to 


― \= e~^ /2 y the pr. density of {Y y Z) converges to ― 尸 = 厂 (a;2 + v2)/2 , and 
V27T V27T 

❿ “(1 + oQ ), ^ = ^( 1 + ° Q )- 






CENTRAL LIMIT PROBLEM 


The Central Limit Problem of probability theory is the problem of 
convergence of laws of sequences of sums of r.v.’s. 

For more than two centuries a particular case, the Classical Limit Problem, has been the limit problem of probability theory. The precise formulation of this case and its solution were obtained in the second quarter of this century. At the very time that this particular problem was receiving its definite answer, the much more general Central Limit Problem appeared, and was solved almost at once, thanks to the powerful tool of ch.f.'s and to the truncation and symmetrization methods.

§ 20. DEGENERATE, NORMAL, AND POISSON TYPES 

20.1 First limit theorems and limit laws. Three limit theorems and 
corresponding limit laws are at the origin of the classical limit problem. 
Let $S_n$ be the number of occurrences of an event of pr. $p$ in $n$ independent and identical trials; to avoid trivialities we assume that $pq \ne 0$, where $q = 1 - p$. If $X_k$ denotes the indicator of the event in the $k$th trial, then $S_n = \sum_{k=1}^{n} X_k$, $n = 1, 2, \dots$, where the summands are independent and identically distributed indicators: this is the Bernoulli case. Since $EX_k = p$, $EX_k^2 = p$ and, hence, $\sigma^2 X_k = p - p^2 = pq$, it follows that

$$ES_n = \sum_{k=1}^{n} EX_k = np, \qquad \sigma^2 S_n = \sum_{k=1}^{n} \sigma^2 X_k = npq.$$

The first limit theorem of pr. theory, published in 1713, says that $S_n/n \xrightarrow{P} p$. Bernoulli found it by a direct but cumbersome analysis of the asymptotic behavior of the "binomial pr.'s" $P[S_n = k] = C_n^k p^k q^{n-k}$, $k = 0, 1, 2, \dots, n$.

Sharpening this analysis, de Moivre obtained the second limit theorem which, in its integral form due to Laplace, says that

$$P\Bigl[\frac{S_n - np}{\sqrt{npq}} < x\Bigr] \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2} \, dy, \qquad -\infty \le x \le +\infty.$$

The third limit theorem was obtained by Poisson, who modified the Bernoulli case by assuming that the pr. $p = p_n$ depends upon the total number $n$ of trials in such a manner that $np_n \to \lambda > 0$. Thus, writing now $X_{nk}$ and $S_{nn}$ instead of $X_k$ and $S_n$, the Poisson case corresponds to sequences of sums $S_{nn} = \sum_{k=1}^{n} X_{nk}$, $n = 1, 2, \dots$, where, for every fixed $n$, the summands $X_{nk}$ are independent and identically distributed indicators with $P[X_{nk} = 1] = \frac{\lambda}{n} + o\bigl(\frac{1}{n}\bigr)$. By a direct analysis of the asymptotic behavior of the binomial pr.'s, much easier to carry than the preceding ones, Poisson proved that

$$P[S_{nn} = k] \to \frac{\lambda^k}{k!} e^{-\lambda}, \qquad k = 0, 1, 2, \dots.$$
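Poisson's limit can likewise be checked directly on the binomial pr.'s. In the sketch below the value $\lambda = 2$, the exact choice $p_n = \lambda/n$ (the $o(1/n)$ term is taken to be zero), the range of $k$ and the helper names are assumptions made for the illustration.

```python
import math

def binom_pr(n, p, k):
    """Binomial pr. P[S_n = k]."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pr(lam, k):
    """Poisson pr. lam^k e^{-lam} / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_gap(n, lam, kmax=10):
    """Largest |P[S_nn = k] - lam^k e^{-lam}/k!| over k <= kmax, p_n = lam/n."""
    return max(abs(binom_pr(n, lam / n, k) - poisson_pr(lam, k))
               for k in range(kmax + 1))

for n in (10, 100, 1000, 10000):
    print(n, max_gap(n, 2.0))  # decreases roughly like 1/n
```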


Thus are born the three basic laws of pr. theory. 

1° The degenerate law $\mathcal{L}(0)$ of a r.v. degenerate at 0 with d.f. having one point of increase only at $x = 0$ and ch.f. reduced to 1.

2° The normal law $\mathcal{N}(0, 1)$ of a normal r.v. with d.f. defined by

$$F(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\Bigl[-\frac{y^2}{2}\Bigr] dy$$

and ch.f. given by

$$f(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \exp\Bigl[iux - \frac{x^2}{2}\Bigr] dx = e^{-u^2/2}.$$

The well-known value of the last integral is obtained by using Cauchy contour integration theorem.






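3° The Poisson law $\mathcal{P}(\lambda)$ of a Poisson r.v. with d.f. defined by

$$F(x) = e^{-\lambda} \sum_{k=0}^{[x]} \frac{\lambda^k}{k!},$$

and ch.f. given by

$$f(u) = e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} e^{iuk} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{iu})^k}{k!} = e^{\lambda(e^{iu} - 1)}.$$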


While the first two limit laws played a central role in the development 
of pr. theory, Poisson’s law long stood isolated and ignored. We shall 
see later that there was a deep reason for this isolation and also that, 
unexpectedly enough, Poisson’s law is, in a sense to be made precise, 
more fundamental for the central limit problem than the two others. 
With the notation introduced above, the three first limit theorems 
can be summarized as follows: 


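A. First limit theorems. In the Bernoulli case $\mathcal{L}\bigl(\frac{S_n - ES_n}{n}\bigr) \to \mathcal{L}(0)$ and $\mathcal{L}\bigl(\frac{S_n - ES_n}{\sigma S_n}\bigr) \to \mathcal{N}(0, 1)$, while in the Poisson case $\mathcal{L}(S_{nn}) \to \mathcal{P}(\lambda)$.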

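The proof by means of ch.f.'s reduces to elementary computations. We have, taking limited expansions of exponentials,

$$E \exp\Bigl[iu \frac{S_n - np}{n}\Bigr] = \prod_{k=1}^{n} E \exp\Bigl[iu \frac{X_k - p}{n}\Bigr] = \Bigl(p \exp\Bigl[\frac{iuq}{n}\Bigr] + q \exp\Bigl[-\frac{iup}{n}\Bigr]\Bigr)^n = \Bigl(1 + o\Bigl(\frac{1}{n}\Bigr)\Bigr)^n \to 1,$$

$$E \exp\Bigl[iu \frac{S_n - np}{\sqrt{npq}}\Bigr] = \Bigl(p \exp\Bigl[\frac{iuq}{\sqrt{npq}}\Bigr] + q \exp\Bigl[-\frac{iup}{\sqrt{npq}}\Bigr]\Bigr)^n = \Bigl(1 - \frac{u^2}{2n} + o\Bigl(\frac{u^2}{n}\Bigr)\Bigr)^n \to \exp\Bigl[-\frac{u^2}{2}\Bigr],$$

$$E \exp[iuS_{nn}] = \prod_{k=1}^{n} E \exp[iuX_{nk}] = (p_n \exp[iu] + q_n)^n = \Bigl(1 + \frac{\lambda}{n}(\exp[iu] - 1) + o\Bigl(\frac{1}{n}\Bigr)\Bigr)^n \to \exp[\lambda(e^{iu} - 1)].$$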

The three first limit laws give rise to the three first limit types : 

the degenerate type of degenerate laws $\mathcal{L}(a)$ with $f(u) = e^{iua}$;

the normal type of normal laws $\mathcal{N}(a, b^2)$ with $f(u) = \exp\bigl[iua - \frac{b^2}{2} u^2\bigr]$;

the Poisson type of Poisson laws $\mathcal{P}(\lambda; a, b)$ with

$$f(u) = \exp[iua + \lambda(e^{iub} - 1)].$$

The three first limit theorems extend at once by means of the con¬ 
vergence of types theorem; we leave the corresponding statements to 
the reader. 

*20.2 Composition and decomposition. The three first limit types 
possess an important closure property. Its deep parts are the normal 
and the Poisson “decompositions” discovered between 1935 and 1937. 
P. Lévy surmised and Cramér proved the first one and, then, Raikov proved the second one.

Let $\mathcal{L}(X)$, $\mathcal{L}(X_1)$, $\mathcal{L}(X_2)$ be laws of r.v.'s with corresponding ch.f.'s $f$, $f_1$, $f_2$. We say that $\mathcal{L}(X)$ is composed of $\mathcal{L}(X_1)$ and $\mathcal{L}(X_2)$ or that $\mathcal{L}(X_1)$ and $\mathcal{L}(X_2)$ are components of $\mathcal{L}(X)$ if, $X_1$ and $X_2$ being independent, $\mathcal{L}(X) = \mathcal{L}(X_1 + X_2)$ or, equivalently, if $f = f_1 f_2$.

A. Composition and decomposition theorem. The degenerate and the normal types are closed under compositions and under decompositions. The same is true of every family of Poisson laws $\mathcal{P}(\lambda; a, b)$ with the same $b$.

To avoid exceptions we consider degenerate laws as degenerate normal 
and as degenerate Poisson ones. 

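Proof. 1° Closure under compositions:

$$\mathcal{L}(a_1) * \mathcal{L}(a_2) = \mathcal{L}(a_1 + a_2),$$

$$\mathcal{N}(a_1, b_1^2) * \mathcal{N}(a_2, b_2^2) = \mathcal{N}(a_1 + a_2, b_1^2 + b_2^2),$$

$$\mathcal{P}(\lambda_1; a_1, b) * \mathcal{P}(\lambda_2; a_2, b) = \mathcal{P}(\lambda_1 + \lambda_2; a_1 + a_2, b),$$

follows by multiplying the corresponding ch.f.'s; in the Poisson case,

$$\exp[iua_1 + \lambda_1(e^{iub} - 1)] \cdot \exp[iua_2 + \lambda_2(e^{iub} - 1)] = \exp[iu(a_1 + a_2) + (\lambda_1 + \lambda_2)(e^{iub} - 1)].$$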
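Closure of the Poisson type under composition can also be seen on the pr.'s themselves, by convolving two Poisson laws. The sketch below is only an illustration; it takes $a = 0$, $b = 1$, and the values of $\lambda_1$, $\lambda_2$ and the helper names are chosen here.

```python
import math

def poisson_pr(lam, k):
    """Poisson pr. lam^k e^{-lam} / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def composed_pr(lam1, lam2, k):
    """P[X1 + X2 = k] for independent Poisson r.v.'s X1, X2 (a = 0, b = 1)."""
    return sum(poisson_pr(lam1, j) * poisson_pr(lam2, k - j)
               for j in range(k + 1))

lam1, lam2 = 0.7, 1.8
gap = max(abs(composed_pr(lam1, lam2, k) - poisson_pr(lam1 + lam2, k))
          for k in range(30))
print(gap)  # zero up to rounding errors
```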

The decomposition property of the degenerate type is immediate. For, if for every $u \in R$, $f_1(u) f_2(u) = e^{iua}$, then $|f_1| |f_2| = 1$ and, since $|f_1| \le 1$, $|f_2| \le 1$, it follows that $|f_1| = |f_2| = 1$, so that by 14.1a

$$f_1(u) = e^{iua_1}, \qquad f_2(u) = e^{iua_2}, \qquad u \in R.$$

The proof in the normal and Poisson cases is much more involved. To begin with, we can, by a linear change of variable, make $a = 0$ and $b = 1$ in the laws to be decomposed. Thus, we have to seek ch.f.'s $f_1$ and $f_2$ such that, for every $u \in R$,

$$f_1(u) f_2(u) = e^{-u^2/2}$$

or

$$f_1(u) f_2(u) = e^{\lambda(e^{iu} - 1)}.$$


2° We consider first the normal decomposition and apply 15.3A.

Since $e^{-z^2/2}$ is an entire nonvanishing function in the complex plane, the same is true of $f_1(z)$ and $f_2(z)$, and there exists a constant $c > 0$ such that $|f_1(z)| \le e^{c|z|^2}$. Therefore, upon taking the principal branch of $\log f_1(z)$ (vanishing at $u = 0$), it follows from the Hadamard factorization theorem that $\log f_1(z)$ is a polynomial in $z$ of, at most, second degree. Since $f_1(u)$, being a ch.f., reduces to 1 at $u = 0$, equals $\overline{f_1(-u)}$, and is bounded on $R$, it follows that

$$\log f_1(u) = iua_1 - \frac{b_1^2}{2} u^2, \qquad u \in R,$$

where $a_1$ and $b_1$ are real numbers. Similarly for $f_2(u)$, and the normal decomposition is proved.

3° It remains for us to consider the Poisson decomposition. Let $X_1$ and $X_2$ be two independent r.v.'s with d.f.'s $F_1$ and $F_2$, and let $F$ be the d.f. of their sum. Since

$$[a_1 \le X_1 < b_1][a_2 \le X_2 < b_2] \subset [a_1 + a_2 \le X_1 + X_2 < b_1 + b_2]$$

and $X_1$, $X_2$ are independent, we have

$$(1) \qquad F_1[a_1, b_1) \cdot F_2[a_2, b_2) \le F[a_1 + a_2, b_1 + b_2)$$

and, letting $b_1, b_2 \to \infty$, it follows that

$$(2) \qquad F(a_1 + a_2) \le F_1(a_1) + F_2(a_2).$$

Let now $\alpha_1$ and $\alpha_2$ be points of increase of $F_1$ and $F_2$, respectively. If $\alpha_1 \in (a_1, b_1)$ and $\alpha_2 \in (a_2, b_2)$, whence $\alpha_1 + \alpha_2 \in (a_1 + a_2, b_1 + b_2)$, then the left-hand side in (1) is positive and, hence, $\alpha_1 + \alpha_2$ is point of increase of $F$. Moreover, if $\alpha_1$ and $\alpha_2$ are first points of increase, then, taking $a_1 < \alpha_1$ and $a_2 < \alpha_2$ in (2), we have $F(a_1 + a_2) = 0$, and, hence, $\alpha_1 + \alpha_2$ is the first point of increase of $F$.

Now let $F$ be the Poisson d.f. corresponding to $\mathcal{P}(\lambda)$; its only points of increase are $k = 0, 1, 2, \dots$. Therefore, on account of what precedes, all points of increase $\alpha_1$ and $\alpha_2$ of its components $F_1$ and $F_2$ are such that $\alpha_1 + \alpha_2 =$ some $k$ and the first points of increase are $a$ and $-a$ where $a$ is some finite number. It follows, replacing $F_1(x)$ by $F_1(x - a)$ and $F_2(x)$ by $F_2(x + a)$ (this does not change $F$), that the new d.f.'s have $k = 0, 1, 2, \dots$ as the only possible points of increase. Thus, we can set for the corresponding ch.f.'s
and it follows that


ak ^ 


1 \ k e~^ 
k\ 


U( Z )| 


Thus, $\varphi_1(z)$ and similarly $\varphi_2(z)$ are nonvanishing entire functions at most of first order. It follows from the Hadamard factorization theorem that they are of the form $e^{cz + c'}$. Since $f_1(u)$ reduces to 1 at $u = 0$ and is bounded by 1, we have

$$\log f_1(u) = \lambda_1(e^{iu} - 1), \qquad \lambda_1 \ge 0.$$

Similarly for $f_2$, and the Poisson decomposition is proved. This terminates the proof of the theorem.


§21. EVOLUTION OF THE PROBLEM 

21.1 The problem and preliminary solutions. From the time of Laplace and until 1935, the limit problem aims at weakenings of the assumptions under which the law of large numbers (convergence to $\mathcal{L}(0)$) and the normal convergence (convergence to $\mathcal{N}(0, 1)$) hold. This classical problem can be stated as follows:

Let $S_n = \sum_{k=1}^{n} X_k$ be consecutive sums of independent r.v.'s. Find conditions under which

$$\mathcal{L}\Bigl(\frac{S_n - ES_n}{n}\Bigr) \to \mathcal{L}(0), \qquad \mathcal{L}\Bigl(\frac{S_n - ES_n}{\sigma S_n}\Bigr) \to \mathcal{N}(0, 1).$$

It is implicitly assumed, in the first case, that the summands are integrable, and in the second case that their squares also are integrable. To simplify the writing, we shall center the summands at expectations, so that, in this section, $EX_k = 0$, $ES_n = 0$. We also set $f_k(u) = Ee^{iuX_k}$, $\sigma_k = \sigma X_k$ and $s_n = \sigma S_n$, and exclude the trivial case of all summands degenerate.


Although not the first historically, the solution of the extension of 
the Bernoulli case to independent and identically distributed sum¬ 
mands (not necessarily indicators) is immediate — when ch.f.’s are used. 

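A. If the summands are independent, identically distributed, and centered at expectations, then $\mathcal{L}\bigl(\frac{S_n}{n}\bigr) \to \mathcal{L}(0)$ and $\mathcal{L}\bigl(\frac{S_n}{s_n}\bigr) \to \mathcal{N}(0, 1)$.

For, if $f$ is the common ch.f. of the summands, then, by using its limited expansions, we have

$$E \exp\Bigl[iu \frac{S_n}{n}\Bigr] = \Bigl(f\Bigl(\frac{u}{n}\Bigr)\Bigr)^n = \Bigl(1 + o\Bigl(\frac{1}{n}\Bigr)\Bigr)^n \to 1$$

and, since $s_n^2 = n\sigma^2 > 0$,

$$E \exp\Bigl[iu \frac{S_n}{s_n}\Bigr] = \Bigl(f\Bigl(\frac{u}{\sigma\sqrt{n}}\Bigr)\Bigr)^n = \Bigl(1 - \frac{u^2}{2n} + o\Bigl(\frac{1}{n}\Bigr)\Bigr)^n \to \exp\Bigl[-\frac{u^2}{2}\Bigr].$$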

However, the first reasonably general conditions are the following. 

B. Let $S_n = \sum_{k=1}^{n} X_k$ and $s_n = \sigma S_n$, where the summands are independent r.v.'s centered at expectations.

(i) If $\frac{1}{n^{1+\delta}} \sum_{k=1}^{n} E|X_k|^{1+\delta} \to 0$ for a positive $\delta \le 1$, then $\mathcal{L}\bigl(\frac{S_n}{n}\bigr) \to \mathcal{L}(0)$.

(ii) If $\frac{1}{s_n^{2+\delta}} \sum_{k=1}^{n} E|X_k|^{2+\delta} \to 0$ for a positive $\delta$, then $\mathcal{L}\bigl(\frac{S_n}{s_n}\bigr) \to \mathcal{N}(0, 1)$.

The assumptions imply finiteness of moments $E|X_k|^{1+\delta}$ and $E|X_k|^{2+\delta}$, respectively.

The first assertion is slightly more general than the classical ones. For $\delta = 1$, it becomes the celebrated Tchebichev's theorem. It also contains Markov's theorem: if $E|X_k|^{1+\delta} \le c < \infty$, then $\mathcal{L}\bigl(\frac{S_n}{n}\bigr) \to \mathcal{L}(0)$ (since, then, the asserted condition becomes $\frac{c}{n^\delta} \to 0$); since, for $\delta > 1$, $EX_k^2 \le (E|X_k|^{1+\delta})^{2/(1+\delta)}$, Markov's theorem is valid with any $\delta > 0$.

The second assertion is the celebrated Liapounov's theorem which has been the turning point for the entire Central Limit theorem. Moreover, while the ch.f.'s were known to and used by Laplace, the first continuity theorem for ch.f.'s:

if $f_n(u) \to e^{-u^2/2}$, then $\mathcal{L}(Z_n) \to \mathcal{N}(0, 1)$,

is to be found, proved but not stated, in Liapounov's proof of his theorem. We observe that (ii) has content only when at least one of the r.v.'s is not degenerate at zero and, then, the hypothesis implies that $s_n \to \infty$.
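To see condition (ii) at work on summands that are not identically distributed, one can compute Liapounov's ratio explicitly. The family used below, $X_k = \pm k^a$ with pr. 1/2 each, is an example chosen here (it is not from the text), as are the function name and the default values $a = 1$, $\delta = 1$; for it $\sigma_k^2 = k^{2a}$ and $E|X_k|^{2+\delta} = k^{(2+\delta)a}$.

```python
def liapounov_ratio(n, a=1.0, delta=1.0):
    """(1/s_n^(2+delta)) * sum_k E|X_k|^(2+delta) for X_k = +-k^a, pr. 1/2 each."""
    s2 = sum(k ** (2 * a) for k in range(1, n + 1))              # s_n^2
    top = sum(k ** ((2 + delta) * a) for k in range(1, n + 1))   # sum of moments
    return top / s2 ** (1 + delta / 2)

for n in (10, 100, 1000, 10000):
    print(n, liapounov_ratio(n))  # decreases roughly like n**(-1/2)
```

Under these assumptions the ratio tends to 0, so that (ii) applies and the normed sums obey the normal convergence.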

Proof. 1° To begin with, let us reduce in (ii) the case $\delta > 1$ to $\delta = 1$, so that it will suffice to assume that $0 < \delta \le 1$.

Let $Y$ be a r.v. whose d.f. is $\frac{1}{n} \sum_{k=1}^{n} F_k$ and, hence,

$$E|Y|^r = \frac{1}{n} \sum_{k=1}^{n} E|X_k|^r.$$

According to 9.3b, $\log E|Y|^r$ is a convex from below function of $r > 0$. Therefore, for $2 + \delta > 3$, we have

$$\delta \cdot \log E|Y|^3 \le (\delta - 1) \log E|Y|^2 + \log E|Y|^{2+\delta}$$

or, equivalently,

$$\frac{1}{s_n^3} \sum_{k=1}^{n} E|X_k|^3 \le \Bigl(\frac{1}{s_n^{2+\delta}} \sum_{k=1}^{n} E|X_k|^{2+\delta}\Bigr)^{1/\delta}.$$

It follows that, if the condition in (ii) holds for a $\delta > 1$, then it holds for $\delta = 1$. Thus, in what follows we can limit ourselves to $0 < \delta \le 1$.

2° We use limited expansions of ch.f.'s, the continuity theorem, and the expansion $\log(1 + z) = z + o(|z|)$ valid for $|z| < 1$. As usual, $\theta$ with or without affixes denotes quantities bounded by 1.

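Condition (i) implies that

$$\max_{k \le n} \frac{E|X_k|^{1+\delta}}{n^{1+\delta}} \le \frac{1}{n^{1+\delta}} \sum_{k=1}^{n} E|X_k|^{1+\delta} \to 0,$$

so that, for $u$ arbitrary but fixed,

$$f_k\Bigl(\frac{u}{n}\Bigr) = 1 + \frac{2^{1-\delta}}{1+\delta} \theta_{nk} |u|^{1+\delta} \frac{E|X_k|^{1+\delta}}{n^{1+\delta}} \to 1$$

uniformly in $k \le n$. Therefore, for $n$ sufficiently large,

$$\sum_{k=1}^{n} \log f_k\Bigl(\frac{u}{n}\Bigr) = 2\theta_n |u|^{1+\delta} \frac{1}{n^{1+\delta}} \sum_{k=1}^{n} E|X_k|^{1+\delta} \to 0,$$

and the first assertion is proved.

Condition (ii) implies that

$$\max_{k \le n} \Bigl(\frac{\sigma_k}{s_n}\Bigr)^{2+\delta} \le \max_{k \le n} \frac{E|X_k|^{2+\delta}}{s_n^{2+\delta}} \le \frac{1}{s_n^{2+\delta}} \sum_{k=1}^{n} E|X_k|^{2+\delta} \to 0,$$

so that, for $u$ arbitrary but fixed,

$$f_k\Bigl(\frac{u}{s_n}\Bigr) = 1 - \frac{u^2}{2} \frac{\sigma_k^2}{s_n^2} + \frac{2^{1-\delta} \theta_{nk} |u|^{2+\delta}}{(1+\delta)(2+\delta)} \frac{E|X_k|^{2+\delta}}{s_n^{2+\delta}} \to 1$$

uniformly in $k \le n$. Therefore, for $n$ sufficiently large,

$$\sum_{k=1}^{n} \log f_k\Bigl(\frac{u}{s_n}\Bigr) = -\frac{u^2}{2}(1 + o(1)) + 2\theta_n |u|^{2+\delta} \frac{1}{s_n^{2+\delta}} \sum_{k=1}^{n} E|X_k|^{2+\delta} \to -\frac{u^2}{2},$$

and Liapounov's theorem is proved.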

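Bounded case. If the summands are uniformly bounded, then $\mathcal{L}(S_n/n) \to \mathcal{L}(0)$. If, moreover, $s_n \to \infty$, then $\mathcal{L}(S_n/s_n) \to \mathcal{N}(0, 1)$.

For, if $|X_k| \le c < \infty$, then $E|X_k|^{1+\delta} \le c^{1+\delta}$ and $E|X_k|^{2+\delta} \le c^\delta \sigma_k^2$, and, hence,

$$\frac{1}{n^{1+\delta}} \sum_{k=1}^{n} E|X_k|^{1+\delta} \le \frac{c^{1+\delta}}{n^\delta} \to 0, \qquad \frac{1}{s_n^{2+\delta}} \sum_{k=1}^{n} E|X_k|^{2+\delta} \le \frac{c^\delta}{s_n^\delta} \to 0 \quad \text{as } s_n \to \infty.$$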

Tools for solution. The preceding theorem is not satisfactory since 
moments of higher order than those which figure in the formulation of 
the problem are used. Yet a restatement of this theorem with $\delta = 1$,
together with the truncation method, will provide the stepping stone 
towards the solution. 

a. Basic lemma. If $S_{nn} = \sum_{k=1}^{n} X_{nk}$, where the summands are independent r.v.'s (centered at expectations), then

(i) if $\frac{1}{n^2} \sum_k E|X_{nk}|^2 \to 0$, then $\mathcal{L}\bigl(\frac{S_{nn}}{n}\bigr) \to \mathcal{L}(0)$;

(ii) if $\frac{1}{s_{nn}^3} \sum_{k=1}^{n} E|X_{nk}|^3 \to 0$, then $\mathcal{L}\bigl(\frac{S_{nn}}{s_{nn}}\bigr) \to \mathcal{N}(0, 1)$.

It suffices to replace in the proof of 21.2B subscripts $k$ and $n$ by double subscripts $nk$ and $nn$, respectively.

In order to use the truncation method we shall require a weak form of the equivalence lemma. We say that two sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(X'_n)$ of laws are equivalent if, for every subsequence $\mathcal{L}(X_{n'}) \to \mathcal{L}(X)$, we have $\mathcal{L}(X'_{n'}) \to \mathcal{L}(X)$, and conversely.

b. Law-equivalence lemma. If $X_n - X'_n \xrightarrow{P} 0$ or $P[X_n \ne X'_n] \to 0$, then the sequences $\mathcal{L}(X_n)$ and $\mathcal{L}(X'_n)$ of laws are equivalent.

For the second condition implies the first one which, by 10.1d, implies the asserted equivalence.


21.2 Solution of the Classical Limit Problem. We are now in a 
position to give a complete solution of the problem. 

$X_1, X_2, \dots$ are independent r.v.'s centered at expectations, with d.f.'s $F_1, F_2, \dots$, ch.f.'s $f_1, f_2, \dots$, and variances $\sigma_1^2, \sigma_2^2, \dots$; $S_n = \sum_{k=1}^{n} X_k$ are their consecutive sums with variances $s_n^2 = \sum_{k=1}^{n} \sigma_k^2$.

To simplify the writing, we make the convention that all summations are over $k = 1, \dots, n$.


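A. Classical degenerate convergence criterion. $\mathcal{L}\bigl(\frac{S_n}{n}\bigr) \to \mathcal{L}(0)$ if, and only if,

(i) $\sum \int_{|x| \ge n} dF_k \to 0$,

(ii) $\frac{1}{n} \sum \int_{|x| < n} x \, dF_k \to 0$,

(iii) $\frac{1}{n^2} \sum \Bigl\{\int_{|x| < n} x^2 \, dF_k - \Bigl(\int_{|x| < n} x \, dF_k\Bigr)^2\Bigr\} \to 0$.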

Proof. 1° Let (i), (ii), and (iii) hold. We wish to prove that $\mathcal{L}\bigl(\frac{S_n}{n}\bigr) \to \mathcal{L}(0)$. In what follows we apply the law equivalence lemma and the first part of the basic lemma.

Let $S_{nn} = \sum X_{nk}$, where $X_{nk} = X_k$ or 0 according as $|X_k| < n$ or $|X_k| \ge n$. On account of (i)

$$P\Bigl[\frac{S_{nn}}{n} \ne \frac{S_n}{n}\Bigr] \le \sum P[X_{nk} \ne X_k] = \sum \int_{|x| \ge n} dF_k \to 0,$$

so that it suffices to prove that $\mathcal{L}\bigl(\frac{S_{nn}}{n}\bigr) \to \mathcal{L}(0)$. But, on account of (ii),

$$\frac{1}{n} ES_{nn} = \frac{1}{n} \sum \int_{|x| < n} x \, dF_k \to 0,$$

so that it suffices to prove that $\mathcal{L}\bigl(\frac{S_{nn} - ES_{nn}}{n}\bigr) \to \mathcal{L}(0)$. But this follows, by Tchebichev inequality, from (iii) and

$$\frac{1}{n^2} \sum E|X_{nk} - EX_{nk}|^2 = \frac{1}{n^2} \sum \Bigl\{\int_{|x| < n} x^2 \, dF_k - \Bigl(\int_{|x| < n} x \, dF_k\Bigr)^2\Bigr\} \to 0.$$

2° Conversely, let £ (—\ £(0); equivalently, — —> 0 org n (u)= 

\n / n 

n 

XI fki u / n ) 1 uniformly on every finite interval. Let n be suffi- 

k 现 1 

ciently large so that log | g n (u) j is bounded on [—c t +f】. By the weak 
symmetrization lemma and the second truncation inequality 

2 ti L n - 」 - 合 t L » -」 

^ 7J* log I gn(v) I 2 dv 0. 


Since 


X n S n n — \ «S*n—1 p 


n n 


n n 


so that fiX n /» —> 0, it follows that the foregoing relation with f > 1 yields 
(i) and, hence, £>{S nn /n) —> £(0). But, by the first truncation in¬ 
equality, 


⑴ 2 E <r\X nk /n) = E <r 2 (X nk a /n ) 芸 一 3 log | 以⑴ | 2 — 0, 


so that (iii) holds, and, by Tchebichev inequality, $\frac{S_{nn} - ES_{nn}}{n} \xrightarrow{P} 0$. Therefore,

$$\frac{ES_{nn}}{n} = \frac{S_{nn}}{n} - \frac{S_{nn} - ES_{nn}}{n} \xrightarrow{P} 0,$$

and (ii) holds. The proof is completed.

Observe that centering at, and in fact existence of, expectations were not required. Also, according to the proof,

$$\mathcal{L}\Bigl(\frac{S_n - ES_{nn}}{n}\Bigr) \to \mathcal{L}(0) \iff \text{(i) and (iii) hold.}$$


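B. Classical normal convergence criterion. $\mathcal{L}\bigl(\frac{S_n}{s_n}\bigr) \to \mathcal{N}(0, 1)$ and $\max_{k \le n} \frac{\sigma_k}{s_n} \to 0$ if, and only if, for every $\epsilon > 0$,

$$g_n(\epsilon) = \frac{1}{s_n^2} \sum \int_{|x| \ge \epsilon s_n} x^2 \, dF_k \to 0.$$

The "if" part is due to Lindeberg and the "only if" part is due to Feller.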
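The function $g_n(\epsilon)$ is easy to evaluate for symmetric two-valued summands. The two families below, $X_k = \pm k$ and $X_k = \pm 2^k$ (each with pr. 1/2), the value $\epsilon = 0.1$ and the helper names are assumptions chosen here for illustration. For the first family $g_n(\epsilon) \to 0$; for the second neither $g_n(\epsilon)$ nor $\max_k \sigma_k/s_n$ tends to 0.

```python
import math

def lindeberg(values, eps):
    """g_n(eps) for X_k = +-values[k] with pr. 1/2 each (sigma_k = values[k])."""
    s2 = sum(v * v for v in values)          # s_n^2
    cut = eps * math.sqrt(s2)                # eps * s_n
    return sum(v * v for v in values if v >= cut) / s2

def ratio(values):
    """max_k sigma_k / s_n for the same summands."""
    return max(values) / math.sqrt(sum(v * v for v in values))

eps = 0.1
for n in (10, 100, 400):
    slow = [float(k) for k in range(1, n + 1)]   # X_k = +-k
    fast = [2.0 ** k for k in range(1, n + 1)]   # X_k = +-2^k
    print(n, lindeberg(slow, eps), ratio(slow), lindeberg(fast, eps), ratio(fast))
```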

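Proof. 1° Let $g_n(\epsilon) \to 0$ for every $\epsilon > 0$. We apply the law equivalence lemma and the basic lemma.

Since $g_n(\epsilon) \to 0$ for every $\epsilon > 0$, there is a sufficiently slowly decreasing sequence $\epsilon_n \downarrow 0$ such that $\frac{1}{\epsilon_n^2} g_n(\epsilon_n) \to 0$ and, a fortiori, $\frac{1}{\epsilon_n} g_n(\epsilon_n) \to 0$, $g_n(\epsilon_n) \to 0$ (it suffices to select a sequence $n_k \uparrow \infty$ as $k \to \infty$ such that $g_n\bigl(\frac{1}{k}\bigr) < \frac{1}{k^3}$ for $n \ge n_k$ and, then, take $\epsilon_n = \frac{1}{k}$ for $n_k \le n < n_{k+1}$). We have

$$\max_{k \le n} \frac{\sigma_k^2}{s_n^2} \le \max_{k \le n} \frac{1}{s_n^2} \int_{|x| \ge \epsilon_n s_n} x^2 \, dF_k + \epsilon_n^2 \le g_n(\epsilon_n) + \epsilon_n^2 \to 0,$$

and the "if" assertion will be proved if we show that $\mathcal{L}\bigl(\frac{S_n}{s_n}\bigr) \to \mathcal{N}(0, 1)$.

Let $X_{nk} = X_k$ or 0 according as $|X_k| < \epsilon_n s_n$ or $|X_k| \ge \epsilon_n s_n$. Since

$$P\Bigl[\frac{S_{nn}}{s_n} \ne \frac{S_n}{s_n}\Bigr] \le \sum P[X_{nk} \ne X_k] = \sum \int_{|x| \ge \epsilon_n s_n} dF_k \le \frac{1}{\epsilon_n^2} g_n(\epsilon_n) \to 0,$$

it suffices to prove that $\mathcal{L}\bigl(\frac{S_{nn}}{s_n}\bigr) \to \mathcal{N}(0, 1)$.

Since the $X_k$ are centered at expectations, we have

$$|EX_{nk}| = \Bigl|\int_{|x| < \epsilon_n s_n} x \, dF_k\Bigr| = \Bigl|\int_{|x| \ge \epsilon_n s_n} x \, dF_k\Bigr| \le \frac{1}{\epsilon_n s_n} \int_{|x| \ge \epsilon_n s_n} x^2 \, dF_k.$$

Therefore,

$$\frac{1}{s_n} \sum |EX_{nk}| \le \frac{1}{\epsilon_n} g_n(\epsilon_n) \to 0$$

and, setting $s_{nn}^2 = \sigma^2 S_{nn}$, we obtain

$$1 \ge \frac{s_{nn}^2}{s_n^2} \ge 1 - g_n(\epsilon_n) - \frac{g_n^2(\epsilon_n)}{\epsilon_n^2} \to 1.$$

Thus, it suffices to prove that $\mathcal{L}\bigl(\frac{S_{nn} - ES_{nn}}{s_{nn}}\bigr) \to \mathcal{N}(0, 1)$. But, this follows from

$$\frac{1}{s_{nn}^3} \sum E|X_{nk} - EX_{nk}|^3 \le \frac{2\epsilon_n s_n}{s_{nn}^3} \sum E(X_{nk} - EX_{nk})^2 = \frac{2\epsilon_n s_n}{s_{nn}} \to 0,$$

and the "if" assertion is proved.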

2° It remains to prove the “only if” assertion. 

(Tjfc 

Since max - > 0, it follows from 

k^n S n 


普 1 


that 


max 

k Sn 


U(f)- i| - o, i | 2 

Therefore, for n sufficiently large, log /*： ( — ) exists, so that 


0. 


E exp 


. 夂 . 

tu — 

^n- 


UA (-) 

ksal 


exp 


u 2 ^ 

2」 


becomes 


zlog/fc Q_ 


2 


and, since log 2 = 2 — 1 + 2 — 1 | 2 , 
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Upon taking the real parts, we obtain 


--zf 

2 «/1 * I <€* n 


Since 


(1 - cos — ) dF k 


s n / 


E f ( 1 - cos — dF k + 0 ( 1 ). 

€ *n\ Sy\J 


zf (l-cos-)^F k ^^zf ^dF k 

*/| 2 I <«* n \ SY\,f •/] * J <c9f| 


u ‘ 


(Sn 2 - E 


U‘ 


dF k ) =-(l - g n (e)) 


and 


Isr? ' - ^J\x\^ •” 2 

O O - 2Z Li>.j Fk 

2 zf 




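it follows that

$$\frac{u^2}{2} g_n(\epsilon) \le \frac{2}{\epsilon^2} + o(1).$$

Therefore, letting $n \to \infty$ and then $u \to \infty$ in

$$0 \le g_n(\epsilon) \le 4\Bigl(\frac{1}{\epsilon^2 u^2} + o(1)\Bigr),$$

we obtain $g_n(\epsilon) \to 0$. This concludes the proof.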

*21.3 Normal approximation. In his celebrated investigation of nor¬ 
mal convergence, Liapounov examined not only conditions for, but 
also the speed of, this convergence. His results were greatly improved 
by Berry (and, independently, by Esseen) and to present the basic one 
we shall proceed in steps. 

Let $F$ and $G$ be d.f.'s of r.v.'s with corresponding ch.f.'s $f$ and $g$, and let $H = F - G$, $h = f - g$. We exclude the trivial case of $\alpha = \sup |H| = 0$, that is, $H = h = 0$.

a. If $G$ is continuous on $R$, then there exists a finite number $s$ such that either $H(s) = \mp\alpha$ or $H(s + 0) = \alpha$.

Proof. Let $x_n$ be a sequence such that $|H(x_n)| \to \alpha$. It contains a subsequence $x_{n'} \to s$ finite or infinite. Since $H(x) \to 0$ as $x \to \mp\infty$ and $\alpha > 0$, $s$ must be finite.

The sequence $x_{n'}$ contains a subsequence $x_{n''}$ such that either $H(x_{n''}) \to -\alpha$ or $H(x_{n''}) \to +\alpha$. It suffices to consider one case only, say the first, for the same argument is valid for the other. Thus, let $x_{n''} \to s$, $H(x_{n''}) \to -\alpha$; we know that $H$ is continuous from the left.

If the sequence $x_{n''}$ contains a subsequence converging to $s$ from the left, then $-\alpha = \lim H(x_{n''}) = H(s)$, and the assertion is proved. Otherwise, this sequence contains a subsequence converging to $s$ from the right, $-\alpha = H(s + 0)$ and, $G$ being continuous on $R$,

$$-\alpha \le H(s) \le F(s + 0) - G(s) = F(s + 0) - G(s + 0) = -\alpha,$$

so that $-\alpha = H(s)$. The assertion is proved.

Let $p$ be the derivative of a symmetric d.f. (of a r.v.) differentiable on $R$, so that $p(x) = p(-x)$, $x \in R$.

b. If $G$ has a derivative $G'$ on $R$, then there exists a finite number $a$ such that

$$\Bigl|\int H(x + a) p(x) \, dx\Bigr| \ge \frac{\alpha}{2} \Bigl(1 - 6\int_{\alpha/2\beta}^{\infty} p(x) \, dx\Bigr), \qquad \beta = \sup |G'|.$$

Proof. If $\beta = \infty$, then $\frac{\alpha}{2\beta} = 0$, and the inequality is trivially true whatever be $a$. Thus, it suffices to prove it when $\beta < \infty$. Let

$$\gamma = \frac{\alpha}{2\beta} > 0.$$

We have, for an arbitrary $a$,

$$(1) \qquad \Bigl|\int H(x + a) p(x) \, dx\Bigr| \ge \Bigl|\int_{|x| < \gamma} H(x + a) p(x) \, dx\Bigr| - \Bigl|\int_{|x| \ge \gamma} H(x + a) p(x) \, dx\Bigr|$$

and

$$(2) \qquad \Bigl|\int_{|x| \ge \gamma} H(x + a) p(x) \, dx\Bigr| \le \alpha \int_{|x| \ge \gamma} p(x) \, dx.$$

On the other hand, according to a, there exists a finite number $s$ such that, say, $-\alpha = H(s)$. For $|x| < \gamma$, we have, setting $a = s - \gamma$ so that

$$s - 2\gamma < x + a < s, \qquad x - \gamma < 0,$$

the relation

$$G(x + a) = G(s) + \theta(x - \gamma) G'(x'), \qquad |\theta| \le 1, \quad s - 2\gamma < x' < s.$$

Thus, for $|x| < \gamma$,

$$H(x + a) = F(x + a) - G(s) - \theta(x - \gamma) G'(x') \le F(s) - G(s) - \beta(x - \gamma) = -\alpha - \beta(x - \gamma) = -\beta(x + \gamma),$$

and it follows that

$$(3) \qquad \int_{|x| < \gamma} H(x + a) p(x) \, dx \le -\beta \int_{|x| < \gamma} (x + \gamma) p(x) \, dx = -\beta\gamma \int_{|x| < \gamma} p(x) \, dx = -\frac{\alpha}{2} \Bigl(1 - \int_{|x| \ge \gamma} p(x) \, dx\Bigr).$$

Upon substituting in (1) the bounds given by (2) and (3), we obtain

$$\Bigl|\int H(x + a) p(x) \, dx\Bigr| \ge \frac{\alpha}{2} \Bigl(1 - 3\int_{|x| \ge \gamma} p(x) \, dx\Bigr)$$

and the assertion follows. In the case $\alpha = H(s + 0)$, the argument is similar.

Let $\omega$ be a real ch.f. with $\int |\omega(u)| \, du < \infty$, so that the corresponding d.f. has a symmetric derivative $p$ continuous on $R$, given by

$$p(x) = \frac{1}{2\pi} \int e^{-iux} \omega(u) \, du = \frac{1}{2\pi} \int \cos ux \cdot \omega(u) \, du.$$

c. For every $a \in R$

$$\frac{1}{2\pi} \int \Bigl|\frac{h(u)\omega(u)}{u}\Bigr| du \ge \Bigl|\int H(x + a) p(x) \, dx\Bigr|.$$

Proof. We can assume $\frac{h(u)\omega(u)}{u}$ to be integrable, for otherwise the inequality is trivially true. According to the composition theorem, $h\omega$ is the Fourier-Stieltjes transform of $\bar H$ defined by

$$\bar H(x) = \int H(x - y) p(y) \, dy.$$

Since $\frac{h(u)\omega(u)}{u}$ is integrable, the inversion formula yields

$$\bar H(x) - \bar H(x') = \frac{1}{2\pi} \int \frac{e^{-iux} - e^{-iux'}}{-iu} h(u)\omega(u) \, du.$$

But, as $x' \to -\infty$, $\bar H(x') \to 0$ and, by the Riemann-Lebesgue theorem, $\int \frac{e^{-iux'}}{-iu} h(u)\omega(u) \, du \to 0$. Therefore,

$$\int H(x - y) p(y) \, dy = \frac{1}{2\pi} \int e^{-iux} \frac{h(u)\omega(u)}{-iu} \, du$$

and, hence, replacing $x$ by $a$, $y$ by $-x$, and taking into account that $p$ is symmetric, we obtain

$$\int H(x + a) p(x) \, dx = \frac{1}{2\pi} \int e^{-iua} \frac{h(u)\omega(u)}{-iu} \, du.$$

The asserted inequality follows.

We are now in a position to establish the basic inequality below, of independent interest. We shall require a real integrable function $\omega_0$ defined by $\omega_0(u) = 1 - |u|/U$ or $0$ according as $|u| < U$ or $|u| \ge U$. Its Fourier-Stieltjes transform $p_0$ is given by

\[ p_0(x) = \frac{1}{2\pi}\int_{-U}^{+U}\left(1 - \frac{|u|}{U}\right)\cos ux\,du = \frac{1 - \cos Ux}{\pi U x^2} \]

and we have $p_0 \ge 0$, $\int p_0(x)\,dx = 1$, so that $\omega_0$ is a ch.f.
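The closed form of $p_0$ can be checked numerically. The following is only a sketch under stated assumptions: the function names (`omega0`, `p0_closed`, `p0_numeric`), the midpoint quadrature, and the value $U = 3$ are illustrative choices and not part of the text.

```python
import math

def omega0(u, U):
    """Triangular function: 1 - |u|/U on (-U, U), 0 elsewhere."""
    return max(0.0, 1.0 - abs(u) / U)

def p0_closed(x, U):
    """Closed form (1 - cos Ux) / (pi U x^2); its limit at x = 0 is U / (2 pi)."""
    if abs(x) < 1e-8:
        return U / (2.0 * math.pi)
    return (1.0 - math.cos(U * x)) / (math.pi * U * x * x)

def p0_numeric(x, U, steps=20000):
    """Midpoint rule for (1 / 2 pi) * integral over (-U, U) of omega0(u) cos(ux) du."""
    h = 2.0 * U / steps
    total = 0.0
    for j in range(steps):
        u = -U + (j + 0.5) * h
        total += omega0(u, U) * math.cos(u * x)
    return total * h / (2.0 * math.pi)

U = 3.0  # illustrative value only
for x in (0.0, 0.5, 1.0, 2.5):
    print(x, p0_numeric(x, U), p0_closed(x, U))
```

The two printed columns should agree to within the quadrature error, which illustrates the inversion formula for this particular pair $(\omega_0, p_0)$.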

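A. Basic inequality. If $G$ has a derivative $G'$ on $R$, then, for every $U > 0$,

\[ \sup |H| \le \frac{2}{\pi}\int_0^U \left| \frac{h(u)}{u} \right| du + \frac{24}{\pi U}\,\sup |G'|. \]

Proof. Upon replacing $\omega$ and $p$ by $\omega_0$ and $p_0$, the propositions b and c yield the inequality

\[ \frac{1}{2\pi}\int_{-U}^{+U} \left| \frac{h(u)}{u} \right| du \ge \frac{1}{2\pi}\int \left| \frac{h(u)\,\omega_0(u)}{u} \right| du \ge \frac{\alpha}{2}\left(1 - 6\int_{\gamma}^{\infty} p_0(x)\,dx\right), \]

with $\alpha = \sup |H|$, $\beta = \sup |G'|$, $\gamma = \alpha/2\beta$, where

\[ \int_{\gamma}^{\infty} p_0(x)\,dx = \frac{1}{\pi}\int_{\gamma U}^{\infty} \frac{1 - \cos x}{x^2}\,dx \le \frac{2}{\pi}\int_{\gamma U}^{\infty} \frac{dx}{x^2} = \frac{2}{\pi\gamma U}. \]

Therefore,

\[ \frac{1}{\pi}\int_0^U \left| \frac{h(u)}{u} \right| du \ge \frac{\alpha}{2} - \frac{12\beta}{\pi U} \]

and the asserted inequality follows.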

In order to apply the basic inequality to the normal approximation problem, we have to bound the corresponding $h$. Let $F^*_n$ and $G^*$ be the d.f.'s of $\mathcal{L}(S_n/s_n)$ and $\mathcal{N}(0, 1)$ and let $h^*_n = f^*_n - e^{-u^2/2}$ denote the difference of the corresponding ch.f.'s. The summands $X_n$ are independent r.v.'s centered at expectations, and we set

\[ \gamma_n^3 = E|X_n|^3, \qquad g_n^3 = \frac{1}{s_n^3}\sum_{k=1}^{n} \gamma_k^3. \]

We exclude the case of one of the $\gamma_n$ infinite, for then the normal approximation theorem below is trivially true.


2 




d. If\u \ < —, then | h* n {u) | ^ 2g n z \ u | 3 exp 
Sn 

Proof. 1° First, we prove the assertion under the supplementary 
condition | u | ^ —. Then ^ n 3 | « | 3 ^ 1 and it suffices to prove that 

gn 


ur 


I h*n(u) I ^ 2 exp 

I h*n{u) I ^ |/*n(«) I + exp 
it will suffice to prove that |/* n («) J 2 ^ exp 


But, since 
' ~1 


^ I 尸 n(«) I + exp 
2u 2 




Consider the symmetrized r.v. $X_k - X'_k$ where $X_k$ and $X'_k$ are independent and identically distributed, so that its ch.f. is $|f_k|^2$ and

\[ E(X_k - X'_k)^2 = 2\sigma_k^2, \qquad E|X_k - X'_k|^3 \le 2^3\gamma_k^3 < \infty. \]

Therefore,

\[ |f_k(u)|^2 \le 1 - \sigma_k^2 u^2 + \frac{4}{3}\gamma_k^3 |u|^3 \le \exp\left[ -\sigma_k^2 u^2 + \frac{4}{3}\gamma_k^3 |u|^3 \right] \]


and, replacing u by - and summing over 是 =1, ..•，》, we obtain, 
using the fact that, by assumption, ^ n 3 | « | < 2, 


|/*n(«) I 2 ^ exp 


一 《 2 + 


gn 


3 | u I 3 


3.2 


^ exp 


« 2 + 


u 


2 -i 


exp 


2 

3 




2° It remains to prove the assertion when | « | < — and, hence, 

Sn 




M = — I«I ^~\u \ <i- 


Then, we have 


/fc (3 = 




ff k 2 , , ,, 7 fc 3 I I,, 

U 2 + d — \ u \ 3 = l - r ky 


2s n 


6s n 3 


where | r k | < 士 ， so that 

iog/fc (f) = ~r k + e 、 r k 2 . 

On the other hand, 

I 心⑧ 、標 

Ul 3 


so that 


0 


log/* \ — ' = 


灯 k 


受 1 “ |3 


and, summing over k y we obtain 


log f*n(u) = - y + U I 3 . 
Since, for every number a t e a = l + d r a e 9

„ 3 1 






so that e a < 2 , that 

f*n(u) - exp 


,,2*i 


U 

~ 2 \ 


^ 2 ^ I « I 3 ex P 


,2i 


U 

2」 


^ 2gn 3 \ « I 3 exp - yJ» 

and the proof is complete. 

B. Normal approximation theorem. There exists a numerical constant $c < \infty$ such that, for all $x$ and all $n$, if $F^*_n$ is d.f. of $\mathcal{L}(S_n/s_n)$ and $G^*$ is d.f. of $\mathcal{N}(0, 1)$, then

\[ |F^*_n(x) - G^*(x)| \le \frac{c}{s_n^3}\sum_{k=1}^{n} E|X_k|^3. \]


For, upon replacing h* n by its bound obtained above in the basic 
inequality with U = } F = F* ny and G = G* hence sup | G r | = 




,we obtain 




t? exp 


u 


2-i 


du + 


\/27r* 


). 


§ 22. CENTRAL LIMIT PROBLEM; THE CASE OF BOUNDED VARIANCES

22.1 Evolution of the problem. The classical limit problem deals with independent summands $X_n$ with finite first moments and, in the normal convergence case, with finite second moments as well. Those moments are used for changing origins and scales of values of the consecutive sums $S_n = \sum_{k=1}^{n} X_k$ so as to avoid shifts of the pr. spreads towards infinite values. There is no reason for these choices of "norming" quantities except an historical one; they are a straightforward extension to more general cases of the norming quantities which appeared in the Bernoulli case. A priori, there is no reason to expect that these quantities will continue to play the same role in the general case. Furthermore, whether they are available (that is, exist and are finite) or not, other choices might achieve the same purpose. Thus, the problem becomes a search for conditions under which the law of large numbers and the normal convergence hold for normed sums $\frac{S_n}{b_n} - a_n$. The methods remain those of the classical problem, but the computations become more involved. However, remnants of the two first limit theorems in the Bernoulli case are still visible. For there is no other reason to expect or to look for limit laws which are either degenerate or normal.

The real liberation which gave birth to the Central Limit Problem came with a new approach due to P. Lévy. He stated and solved the following problem: Find the family of all possible limit laws of normed sums of independent and identically distributed r.v.'s. We saw that when these r.v.'s have a finite second moment, the limit law (with classical norming quantities) is normal. Thus, P. Lévy was concerned primarily with the novel case of infinite second moments and finite or infinite first moments.


Naturally, the question of all possible limit laws of normed sums with independent, but not necessarily identically distributed, r.v.'s arises at once. Yet, the Poisson limit theorem is still out, for it is relative to sequences of sums and not to sequences of normed consecutive sums. Moreover, as we shall find later (end of 24.4), under "natural" restrictions Poisson laws cannot be limit laws of sequences of normed sums, which explains their isolation. But sequences $\frac{S_n}{b_n} - a_n$ are a particular form of sequences $\sum_k X_{nk}$ (set $X_{nk} = \frac{X_k}{b_n} - \frac{a_n}{n}$), and this provides the final modification of the problem.

The general outline of the Central Limit Problem is now visible: Find the limit laws of sequences of sums of independent summands and find conditions for convergence to a specified one. Yet, so general a problem is without content. In fact, let $Y_n$ be arbitrary r.v.'s, set $X_{n1} = Y_n$ and $X_{nk} = 0$ a.s. for $k > 1$ and every $n$. Then the sequence of laws becomes the sequence $\mathcal{L}(Y_n)$, so that the family of possible limit laws contains any law $\mathcal{L}$: take $\mathcal{L}(Y_n) \equiv \mathcal{L}$. Thus, some restriction is needed.

To find a "natural" one, let us consider the problems which led to this one. Their common feature is that the number of summands increases indefinitely and that the limit law remains the same if an arbitrary but finite number of summands is dropped. To emphasize this feature, we are led to the following "natural" restriction: the summands $X_{nk}$ are uniformly asymptotically negligible (uan), that is, $X_{nk} \xrightarrow{P} 0$ uniformly in $k$ or, equivalently, for every $\epsilon > 0$,

\[ \max_k P[\,|X_{nk}| \ge \epsilon\,] \to 0. \]

Finally, the precise formulation of the problem is as follows:

Central limit problem. Let $S_n = \sum_{k=1}^{k_n} X_{nk}$ be sums of uan independent summands with $k_n \to \infty$.

1° Find the family of all possible limit laws of these sums.

2° Find conditions for convergence to any specified law of this family.

To simplify the writing, we make the following conventions valid for the whole chapter.

(i) $k = 1, \ldots, k_n$, $k_n \to \infty$; the summations $\sum_k$, the products $\prod_k$, the maxima $\max_k$, are over these values of $k$, and the limits are taken as usually for $n \to \infty$ unless otherwise stated.

(ii) $F_{nk}$ and $f_{nk}$ denote the d.f. and the ch.f. of r.v.'s $X_{nk}$; $F_n$ and $f_n$ denote the d.f. and the ch.f. of $\sum_k X_{nk}$. Thus, the uan condition becomes: $\max_k \int_{|x| \ge \epsilon} dF_{nk} \to 0$ for every $\epsilon > 0$, and the assumption of independence becomes $f_n = \prod_k f_{nk}$. The problem becomes

Given sequences $f_n = \prod_k f_{nk}$ of products of ch.f.'s of uan r.v.'s: 1° Find all ch.f.'s $f$ such that $f_n \to f$; 2° Find conditions under which $f_n \to f$ given.

If these ch.f.'s have log's on $I = [-U, +U]$, we always select their principal branches, continuous and vanishing at $u = 0$, and then on $I$:

\[ \log f_n = \sum_k \log f_{nk}, \qquad f_n \to f \text{ (uniformly)} \iff \log f_n \to \log f \text{ (uniformly)}. \]
k e 

The solution of the problem is due to the introduction, by de Finetti, of the "infinitely decomposable" family of laws and to the discovery of their explicit representation by Kolmogorov in the case of finite second moments and by P. Lévy in the general case.

It has been obtained, with the help of the preceding family of laws, by the efforts of Kolmogorov, P. Lévy, Feller, Bawly, Khintchine, Marcinkiewicz, Gnedenko, and Doblin (1931-1938). The final form is essentially due to Gnedenko.

22.2 The case of bounded variances. As a preliminary to the investigation of the general problem, and independently of it, we examine here the particular "case of bounded variances", a "natural" extension of the classical normal convergence problem. It is much less involved computationally than the general one, while the method of attack is essentially the same.

We consider sums $\sum_k X_{nk}$ of independent r.v.'s, centered at expectations, with d.f.'s $F_{nk}$, ch.f.'s $f_{nk}$ and finite variances $\sigma_{nk}^2 = \sigma^2 X_{nk}$ such that

(C): $\max_k \sigma_{nk}^2 \to 0$ and $\sum_k \sigma_{nk}^2 \le c < \infty$, where $c$ is a constant independent of $n$.

Since, for every $\epsilon > 0$,

\[ \max_k P[\,|X_{nk}| \ge \epsilon\,] \le \frac{1}{\epsilon^2}\max_k \sigma_{nk}^2 \to 0, \]

the uan condition is satisfied and the model is a particular case of that of the Central Limit Problem. The boundedness of the sequence of variances of the sums entails finiteness of the variance of the limit law.

a. Comparison lemma. Under (C), $\log f_{nk}(u)$ exists and is finite for $n \ge n_u$ sufficiently large and, for any fixed $u$,

\[ \sum_k \{\log f_{nk}(u) - (f_{nk}(u) - 1)\} \to 0. \]

Proof. Since $f_{nk}(u) = 1 - \theta_{nk}\frac{\sigma_{nk}^2}{2}u^2$ with $|\theta_{nk}| \le 1$, it follows from (C) that

\[ \max_k |f_{nk}(u) - 1| \le \frac{u^2}{2}\max_k \sigma_{nk}^2 \to 0, \qquad \sum_k |f_{nk}(u) - 1| \le \frac{c}{2}u^2. \]

Therefore, for $n \ge n_u$ sufficiently large, $|f_{nk}(u) - 1| \le \frac{1}{2}$, so that the $\log f_{nk}(u)$ exist and are finite,

\[ \log f_{nk}(u) = f_{nk}(u) - 1 + \theta'_{nk}|f_{nk}(u) - 1|^2, \qquad |\theta'_{nk}| \le 1, \]

and it follows that

\[ \left| \sum_k \{\log f_{nk}(u) - (f_{nk}(u) - 1)\} \right| \le \sum_k |f_{nk}(u) - 1|^2 \le \max_k |f_{nk}(u) - 1| \sum_k |f_{nk}(u) - 1| \to 0. \]

The comparison lemma is proved.
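The proof rests on the elementary bound $|\log z - (z - 1)| \le |z - 1|^2$ for $|z - 1| \le \frac{1}{2}$ (principal branch). A minimal numerical sketch of that bound follows; the grid sizes and the helper names `log_remainder_ratio` and `worst_ratio` are assumptions made for illustration only.

```python
import cmath

def log_remainder_ratio(z):
    """|log z - (z - 1)| / |z - 1|^2 for the principal branch of log."""
    w = z - 1.0
    return abs(cmath.log(z) - w) / abs(w) ** 2

def worst_ratio(radius=0.5, n_r=50, n_angle=360):
    """Largest ratio found on a polar grid covering 0 < |z - 1| <= radius."""
    worst = 0.0
    for i in range(1, n_r + 1):
        r = radius * i / n_r
        for j in range(n_angle):
            z = 1.0 + cmath.rect(r, 2.0 * cmath.pi * j / n_angle)
            worst = max(worst, log_remainder_ratio(z))
    return worst

# The ratio stays below 1 on the disc; the grid maximum is reached near z = 1/2.
print(worst_ratio())
```

A grid search is of course not a proof; it only illustrates that the constant 1 in front of $|z - 1|^2$ is comfortable on the disc used in the lemma.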

Let

\[ \psi_n(u) = \sum_k (f_{nk}(u) - 1) = \sum_k \int (e^{iux} - 1)\,dF_{nk}. \]

Since

\[ \int x\,dF_{nk} = 0, \qquad \sum_k \int x^2\,dF_{nk} \le c, \]

we have

\[ \psi_n(u) = \sum_k \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,x^2\,dF_{nk} \]

or

\[ \psi_n(u) = \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,dK_n, \]

where $K_n$ on $R$ is a continuous from the left nondecreasing function with $K_n(-\infty) = 0$, $\operatorname{Var} K_n \le c < \infty$, defined by

\[ K_n(x) = \sum_k \int_{-\infty}^{x} y^2\,dF_{nk}, \]

and the integrand, defined by continuity at $x = 0$, takes there the value $-u^2/2$. The comparison lemma becomes

a'. Under (C), $\log \prod_k f_{nk} - \psi_n \to 0$.

Functions of the foregoing type will be denoted in this subsection by $\psi$ and $K$, with or without affixes. Thus, unless otherwise stated, $\psi$ is a function defined on $R$ by

\[ \psi(u) = \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,dK(x), \]

and $K$ is a d.f. (up to a multiplicative constant) with $K(-\infty) = 0$, $\operatorname{Var} K < \infty$; $\psi$ and $K$ will have same affixes if any.
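To make the definition concrete, here is a hedged sketch that evaluates $\psi$ when $K$ is purely discrete. The representation of $K$ as a dictionary of jumps and the names `integrand`, `psi`, and `centered_poisson_chf` are assumptions of this example, not notation from the text. A single jump $\lambda$ at $x = 1$ should reproduce the ch.f. of a centered Poisson r.v., and a unit jump at $x = 0$ the normal ch.f. $e^{-u^2/2}$.

```python
import cmath
import math

def integrand(u, x):
    """(e^{iux} - 1 - iux) / x^2, extended by continuity to -u^2/2 at x = 0."""
    if x == 0.0:
        return complex(-0.5 * u * u, 0.0)
    return (cmath.exp(1j * u * x) - 1.0 - 1j * u * x) / (x * x)

def psi(u, jumps):
    """psi(u) for a purely discrete K given as {point: jump size}."""
    return sum(mass * integrand(u, x) for x, mass in jumps.items())

def centered_poisson_chf(u, lam, terms=80):
    """E exp(iu(N - lam)) for N Poisson(lam), summed directly from the pmf."""
    total = 0j
    pmf = math.exp(-lam)
    for k in range(terms):
        total += pmf * cmath.exp(1j * u * (k - lam))
        pmf *= lam / (k + 1)
    return total

u, lam = 1.3, 2.0  # illustrative values
print(cmath.exp(psi(u, {1.0: lam})), centered_poisson_chf(u, lam))
print(cmath.exp(psi(u, {0.0: 1.0})), math.exp(-0.5 * u * u))
```

Each printed pair should agree up to rounding, which matches the two particular cases (normal and Poisson) treated later in this subsection.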

b. Every $e^{\psi}$ is a ch.f. with null first moment and finite variance $\sigma^2 = \operatorname{Var} K$, and is a limit law under (C).

Proof. The integrand is bounded in $x$ and continuous in $u$ (or $x$) for every fixed $x$ (or $u$). It follows that $\psi$ is continuous on $R$ and is limit of Riemann-Stieltjes sums of the form

\[ \sum_k \{ iua_{nk} + \lambda_{nk}(e^{iub_{nk}} - 1) \} \]

where

\[ \lambda_{nk} = \frac{1}{x_{nk}^2}K[x_{nk}, x_{n,k+1}), \qquad a_{nk} = -\lambda_{nk}x_{nk}, \qquad b_{nk} = x_{nk}; \]

we can and do take all subdivision points $x_{nk} \ne 0$. Since every summand is log of a (Poisson type) ch.f., the sums are log of ch.f.'s, and so is their limit $\psi$ according to the continuity theorem. The second assertion follows, since, by elementary computations,

\[ (e^{\psi})'_{u=0} = (\psi')_{u=0} = 0, \qquad (e^{\psi})''_{u=0} = (\psi'')_{u=0} = -\operatorname{Var} K. \]

Finally, let $X_{nk}$, $k = 1, \ldots, n$, be independent r.v.'s with common log of ch.f. being $\psi/n$. Since $\psi/n$ corresponds to $K/n$, we have $\sigma^2 X_{nk} = \frac{1}{n}\operatorname{Var} K$ while $EX_{nk} = 0$. Since $\sum_{k=1}^{n} X_{nk}$ has for ch.f. $e^{\psi}$ whatever be $n$ and condition (C) is fulfilled, the last assertion is proved.

c. Uniqueness lemma. $\psi$ determines $K$, and conversely.

Proof. Since

\[ -\psi''(u) = \int e^{iux}\,dK(x), \qquad \operatorname{Var} K < \infty, \quad K(-\infty) = 0, \]

the inversion formula applies and $K$ is determined by $\psi$ by means of $\psi''$. The converse is obvious.

d. Convergence lemma. Let (C) hold. If $K_n \xrightarrow{w} K$, then $\psi_n \to \psi$. Conversely, if $\psi_n \to \log f$, then $K_n \xrightarrow{w} K$ and $\log f = \psi$ determined by $K$.

Proof. The first assertion follows at once from the extended Helly-Bray lemma. As for the converse, since the variations are uniformly bounded, the weak compactness theorem applies and there exists a $K$ (with $\operatorname{Var} K \le c$) such that $K_{n'} \xrightarrow{w} K$ as $n' \to \infty$ along some subsequence of integers. Therefore, by the same lemma, $\psi_{n'} \to \psi = \log f$ since $\psi_n \to \log f$. But, by the uniqueness lemma, $\psi = \log f$ determines $K$, and it follows that $K_n \xrightarrow{w} K$. The proposition is proved.

Upon applying the foregoing lemmas, the answer to our problem follows:

A. Bounded variances limit theorem. If independent summands $X_{nk}$ are centered at expectations and $\max_k \sigma_{nk}^2 \to 0$, $\sum_k \sigma_{nk}^2 \le c < \infty$ for all $n$, then

1° the family of limit laws of sequences $\mathcal{L}(\sum_k X_{nk})$ coincides with the family of laws of r.v.'s centered at expectations with finite variances and ch.f.'s of the form $f = e^{\psi}$, where $\psi$ is of the form

\[ \psi(u) = \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,dK(x) \]

with $K$ continuous from the left and nondecreasing on $R$ and $\operatorname{Var} K \le c < \infty$; $\psi$ determines $K$ and conversely.

2° $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(X)$ with ch.f. necessarily of the form $e^{\psi}$ if, and only if, $K_n \xrightarrow{w} K$ where $K_n$ are defined by

\[ K_n(x) = \sum_k \int_{-\infty}^{x} y^2\,dF_{nk}. \]

If $\sum_k \sigma_{nk}^2 \le c < \infty$ is replaced by $\sum_k \sigma_{nk}^2 \to \sigma^2 X < \infty$, then $K_n \xrightarrow{w} K$ is to be replaced by $K_n \xrightarrow{c} K$.

Proof. 1° follows from b, the comparison lemma and the convergence lemma.

2° follows from 1° and the convergence lemma; and the particular case follows from the fact that the assumption made becomes

\[ \operatorname{Var} K_n = \sum_k \sigma_{nk}^2 \to \sigma^2 X = \operatorname{Var} K. \]

Extension. So far the r.v.'s under consideration were all centered at expectations. If we suppress this condition and set

\[ a_{nk} = EX_{nk}, \qquad \bar{F}_{nk}(x) = F_{nk}(x + a_{nk}), \qquad \bar{f}_{nk}(u) = e^{-iua_{nk}}f_{nk}(u), \]

then the foregoing results continue to apply, provided $F_{nk}$ and $f_{nk}$ are replaced everywhere by $\bar{F}_{nk}$ and $\bar{f}_{nk}$; and then we write $\bar{\psi}$ instead of $\psi$. Going back to the noncentered r.v.'s, we have to introduce limit laws $\mathcal{L}(X)$ with finite variances but not necessarily null expectations $a = EX$, whose log's of ch.f.'s are of the form $\psi(u) = iua + \bar{\psi}(u)$, so that

\[ \psi(u) = iua + \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,dK(x). \]

The uniqueness lemma becomes: $\psi$ determines $a$ and $K$, and conversely.

In the convergence lemma, $K_n \xrightarrow{w} K$ is replaced by $K_n \xrightarrow{w} K$ and $a_n \to a$.

The same is to be done in the limit theorem with $a_n = \sum_k a_{nk}$ and $F_{nk}$ replaced by $\bar{F}_{nk}$.

Thus, the convergence criterion A2° becomes

Extended convergence criterion. If independent summands $X_{nk}$ are such that $\max_k \sigma_{nk}^2 \to 0$ and $\sum_k \sigma_{nk}^2 \le c < \infty$, then $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(X)$ with ch.f. necessarily of the form $e^{\psi}$ if, and only if, $K_n \xrightarrow{w} K$ and $\sum_k a_{nk} \to a$ where

\[ K_n(x) = \sum_k \int_{-\infty}^{x} y^2\,dF_{nk}(y + a_{nk}), \qquad a_{nk} = EX_{nk}. \]

If $\sum_k \sigma_{nk}^2 \le c < \infty$ is replaced by $\sum_k \sigma_{nk}^2 \to \sigma^2 X < \infty$, then $K_n \xrightarrow{w} K$ is to be replaced by $K_n \xrightarrow{c} K$.

Particular cases:

1° Normal convergence. The normal law $\mathcal{N}(0, 1)$ corresponds to $\psi(u) = -\frac{u^2}{2}$ and, hence, to $K$ defined by $K(x) = 0$ or $1$ according as $x < 0$ or $x > 0$ (because of the uniqueness lemma, it suffices to verify that this $K$ gives the above $\psi$).

Normal convergence criterion. Let the independent summands $X_{nk}$, centered at expectations, be such that $\sum_k \sigma_{nk}^2 = 1$ for all $n$; then $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{N}(0, 1)$ and $\max_k \sigma_{nk}^2 \to 0$ if, and only if, for every $\epsilon > 0$,

\[ g_n(\epsilon) = \sum_k \int_{|x| \ge \epsilon} x^2\,dF_{nk} \to 0. \]

Proof. Since

\[ \max_k \sigma_{nk}^2 = \max_k \int x^2\,dF_{nk}(x) \le \epsilon^2 + \max_k \int_{|x| \ge \epsilon} x^2\,dF_{nk} \le \epsilon^2 + g_n(\epsilon), \]

it follows that $g_n(\epsilon) \to 0$ for every $\epsilon > 0$ implies (letting $n \to \infty$ and then $\epsilon \to 0$ in the foregoing relation) $\max_k \sigma_{nk}^2 \to 0$. Then, immediate computations show that the convergence criterion A2° is equivalent to $g_n(\epsilon) \to 0$ for every $\epsilon > 0$.
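The quantity $g_n(\epsilon)$ is easy to compute for discrete summands. The sketch below is illustrative only: the two triangular arrays (`symmetric_signs`, `one_dominant`) and the list-of-pairs representation of a law are assumptions introduced here. In the first array every summand is negligible and $g_n(\epsilon)$ vanishes for large $n$; in the second a single summand carries half of the total variance, so $g_n(\epsilon)$ stays near $\frac{1}{2}$.

```python
def lindeberg_g(summands, eps):
    """g_n(eps) = sum_k E[X_nk^2 ; |X_nk| >= eps] for discrete centered summands.

    Each summand is a list of (value, probability) pairs.
    """
    return sum(p * x * x for law in summands for x, p in law if abs(x) >= eps)

def symmetric_signs(n):
    """n copies of the law with mass 1/2 at +-1/sqrt(n); the variances sum to 1."""
    a = n ** -0.5
    return [[(-a, 0.5), (a, 0.5)]] * n

def one_dominant(n):
    """One summand equal to +-sqrt(1/2); the other n - 1 share the remaining variance."""
    a = (0.5 / (n - 1)) ** 0.5
    big = 0.5 ** 0.5
    return [[(-big, 0.5), (big, 0.5)]] + [[(-a, 0.5), (a, 0.5)]] * (n - 1)

for n in (4, 100, 10001):
    print(n, lindeberg_g(symmetric_signs(n), 0.1), lindeberg_g(one_dominant(n), 0.1))
```

For the second array $\max_k \sigma_{nk}^2 = \frac{1}{2}$ for every $n$, so the failure of $g_n(\epsilon) \to 0$ is consistent with the criterion.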

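Upon setting $X_{nk} = \frac{X_k}{s_n}$, $k = 1, \ldots, n$, $EX_k = 0$, $s_n^2 = \sum_k \sigma^2 X_k$, we obtain the classical normal convergence criterion. Liapounov's theorem follows from

\[ \int_{|x| \ge \epsilon s_n} x^2\,dF_k \le \frac{1}{\epsilon^{\delta}s_n^{\delta}}\int |x|^{2+\delta}\,dF_k. \]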

2° Poisson convergence. The Poisson law $\mathcal{P}(\lambda)$ corresponds to $\psi(u) = iu\lambda + \lambda(e^{iu} - 1 - iu) = iu\lambda + \bar{\psi}(u)$ and, hence, the function $K$ which corresponds to $\bar{\psi}$ is defined by $K(x) = 0$ or $\lambda$ according as $x < 1$ or $x > 1$. The extended convergence criterion yields, by immediate transformations, the following

Poisson convergence criterion. If the independent summands $X_{nk}$ are such that $\max_k \sigma_{nk}^2 \to 0$ and $\sum_k \sigma_{nk}^2 \to \lambda$, then $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{P}(\lambda)$ if, and only if, $\sum_k EX_{nk} \to \lambda$ and, for every $\epsilon > 0$,

\[ \sum_k \int_{|x-1| \ge \epsilon} x^2\,dF_{nk}(x + EX_{nk}) \to 0. \]


*§ 23. SOLUTION OF THE CENTRAL LIMIT PROBLEM 

We consider now the general problem. As was pointed out, the method of attack will be essentially the same as in the case of bounded variances. The computational difficulties will arise from two facts. (1) Even existence of first moments is not assumed, and the centerings, instead of being at expectations, will have to be at truncated expectations. (2) The functions $K$ defined previously are not necessarily of bounded variation and, even when they are, they are not assumed to be of uniformly bounded variation. They will have to be replaced by functions of the form

\[ \Psi_n(x) = \sum_k \int_{-\infty}^{x} \frac{y^2}{1 + y^2}\,d\bar{F}_{nk} \]

where $\bar{F}_{nk}$ will be d.f.'s of the summands centered at truncated expectations. This will lead to limit laws with log ch.f.'s of a more complicated form, which we investigate first.

23.1 A family of limit laws; the infinitely decomposable laws. A law $\mathcal{L}$ and its ch.f. $f$ are said to be infinitely decomposable (i.d.) if, for every integer $n$, there exist (on some pr. space) $n$ independent and identically distributed r.v.'s $X_{nk}$ such that $\mathcal{L} = \mathcal{L}(\sum_{k=1}^{n} X_{nk})$; in other words, for every $n$ there exists a ch.f. $f_n$ such that $f = f_n^n$. If $f \ne 0$, then $\log f$ exists and is finite and $f_n = e^{\frac{1}{n}\log f}$; unless otherwise stated, we select for log of a ch.f. its principal branch (vanishing at $u = 0$) and for the $n$th root of $f$ we take the function defined by the preceding equality. Clearly, if a law is i.d., so is its type. The degenerate, normal, and Poisson type are i.d., since if $\log f(u) = iua$ or $iua - \frac{\sigma^2}{2}u^2$ or $iua + \lambda(e^{iub} - 1)$, then $\frac{1}{n}\log f(u)$ has the same form whatever be $n$. More generally, the limit laws $e^{\psi}$ obtained in the case of bounded variances are i.d., since the corresponding functions $\psi$ are such that $\psi/n$ is log of a ch.f. of the same form (with $a/n \in R$ and $K/n$ d.f. up to a multiplicative constant). In fact

a. The i.d. family belongs to that of limit laws of the Central Limit Problem.

For, on the one hand, the uan condition for independent and identically distributed r.v.'s $X_{nk}$ which figure in the definition of i.d. laws becomes convergence of their common law to the degenerate at $0$, that is, $f_n \to 1$; on the other hand,

b. If, for every $n$, $f = f_n^n$ where $f_n$ is a ch.f., then $f_n \to 1$; and, moreover, $f \ne 0$.

Proof. Since $|f| \le 1$, we have $|f_n|^2 = |f|^{2/n} \to g$ with $g(u) = 0$ or $1$ according as $f(u) = 0$ or $f(u) \ne 0$. Since $f$ is continuous and $f(0) = 1$, there exists a neighborhood of the origin where $|f(u)| > 0$ and, hence, $g(u) = 1$, so that $g$ is continuous in this neighborhood. Thus, the sequence $|f_n|^2$ of ch.f.'s converges to a function $g$ continuous at the origin, the continuity theorem applies, and $g$ is a ch.f. Therefore, $g$ is continuous on $R$ with $g(0) = 1$ and, since it takes at most two values $0$ and $1$, it reduces to $1$. Consequently, $f \ne 0$, $\log f$ exists and is finite, and $f_n = e^{\frac{1}{n}\log f} \to 1$. The proposition is proved.

We shall see later that the family of limit laws of the problem coincides with the i.d. one. This explains the property below.

A. Closure theorem. The i.d. family is closed under compositions and passages to the limit.

Proof. If $f$ and $f'$ are i.d. ch.f.'s, then, for every $n$, there exist ch.f.'s $f_n$ and $f'_n$ such that $f = f_n^n$, $f' = f'^{\,n}_n$, so that $ff' = (f_n f'_n)^n$ where $f_n f'_n$ are ch.f.'s, and the first assertion is proved.

On the other hand, if a sequence $f_n$ of i.d. ch.f.'s converges to a ch.f. $f$, then, for every integer $m$, $|f_n|^{2/m} \to |f|^{2/m}$ and, by the continuity theorem, $|f|^{2/m}$ is a ch.f. Therefore, $|f|^2$ is an i.d. ch.f. and, hence, by b, $f \ne 0$. Since $\log f$ exists and is finite, and

\[ f_n^{1/m} = e^{\frac{1}{m}\log f_n} \to e^{\frac{1}{m}\log f} = f^{1/m}, \]

it follows that $f^{1/m}$ is a ch.f., so that $f$ is i.d. This concludes the proof.

The basic feature of i.d. laws (hence, as we shall see, of all the limit laws of the Central Limit Problem) is that they are constructed by means of Poisson type laws. This is made precise in the theorem below and made explicit by the representation theorem which will follow.

B. Structure theorem. A ch.f. is i.d. if, and only if, it is the limit of sequences of products of Poisson type ch.f.'s.

In other words, the class of i.d. laws coincides with the limit laws of sequences of sums of independent Poisson type r.v.'s.

Proof. Products $f_n$ of Poisson type ch.f.'s are defined by finite sums of the form

\[ \log f_n(u) = \sum_k \{ iua_{nk} + \lambda_{nk}(e^{iub_{nk}} - 1) \}, \qquad \lambda_{nk} \ge 0, \]

so that the functions $\frac{1}{m}\log f_n$ are log of ch.f.'s (of the same kind) whatever be the fixed integer $m$ and the $f_n$ are i.d. ch.f.'s. Thus, by A, if $f_n \to f$ ch.f., then $f$ is i.d. This proves the "if" assertion.

Conversely, if $f$ is i.d., then $\log f$ exists and is finite and

\[ n(f^{1/n} - 1) \to \log f, \qquad f^{1/n}(u) - 1 = \int (e^{iux} - 1)\,dF_n(x), \]

where $F_n$ are d.f.'s. By taking Riemann-Stieltjes sums which approximate $f^{1/n}(u) - 1$ by less than $1/n^2$, the "only if" assertion follows, and the proof is terminated.
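The convergence $n(f^{1/n} - 1) \to \log f$ used in the converse part can be observed numerically. The sketch below assumes a particular i.d. ch.f. (a shift plus a normal plus one Poisson type component); the parameter values and the names `log_f` and `nth_root_approx` are hypothetical choices made for this illustration.

```python
import cmath

def log_f(u, a=0.3, sigma=1.2, lam=2.0, b=0.7):
    """log of an i.d. ch.f.: degenerate shift + normal + one Poisson type component."""
    return 1j * u * a - 0.5 * (sigma * u) ** 2 + lam * (cmath.exp(1j * u * b) - 1.0)

def nth_root_approx(u, n):
    """n (f^{1/n}(u) - 1), with f^{1/n} defined as exp(log f / n)."""
    return n * (cmath.exp(log_f(u) / n) - 1.0)

u = 1.5  # illustrative argument
for n in (1, 10, 100, 1000, 10000):
    print(n, abs(nth_root_approx(u, n) - log_f(u)))
```

The printed gaps shrink roughly like $|\log f(u)|^2 / 2n$, the first neglected term of the exponential series.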


In what precedes, $\psi_n(u) = \int (e^{iux} - 1)\,n\,dF_n(x) \to \log f(u)$ and $\psi_n$ is itself log of an i.d. ch.f. Since $\operatorname{Var}(nF_n) = n \to \infty$, brutal interchange of integration and passage to the limit is excluded. However, the integral inequality in 13.4 yields $\operatorname{Var} \Psi_n \le c < \infty$ with $d\Psi_n(x) = \frac{x^2}{1 + x^2}\,n\,dF_n(x)$, so that the weak compactness theorem applies. But the integrand for $d\Psi_n(x)$ is undetermined at $x = 0$, and we have to modify it. This leads to the $\psi$-functions below:

Unless otherwise stated, $\psi$, with or without affixes, will denote a function defined on $R$ by

\[ \psi(u) = iu\alpha + \int \left( e^{iux} - 1 - \frac{iux}{1 + x^2} \right)\frac{1 + x^2}{x^2}\,d\Psi(x) \]

where $\alpha \in R$ and $\Psi$ denotes a d.f. (up to a multiplicative constant) with $\Psi(-\infty) = 0$; the corresponding $\psi$, $\alpha$, $\Psi$ will have same affixes if any. The value of the integrand at $x = 0$, defined by continuity, is $-u^2/2$.

c. Every $e^{\psi}$ is an i.d. ch.f.

Proof. We use repeatedly the fact that the class of log's of ch.f.'s is closed under additions.

The integrand is bounded in $x$ and continuous in $u$ (or $x$) for every fixed $x$ (or $u$). It follows that the integral is continuous in $u$ and is limit of Riemann-Stieltjes sums of the form

\[ \sum_k \{ iua_{nk} + \lambda_{nk}(e^{iub_{nk}} - 1) \} \]

where

\[ \lambda_{nk} = \frac{1 + x_{nk}^2}{x_{nk}^2}\Psi[x_{nk}, x_{n,k+1}), \qquad a_{nk} = -\lambda_{nk}\frac{x_{nk}}{1 + x_{nk}^2}, \qquad b_{nk} = x_{nk}; \]

we can and do take all $x_{nk} \ne 0$. Since every nonvanishing summand is log of a (Poisson type) ch.f., the sums are log's of ch.f.'s, and so is the integral according to the continuity theorem. Since $iu\alpha$ is log of a ch.f., so is $\psi$ and, hence, so is every $\psi/n$ corresponding to $\alpha/n \in R$ and $\Psi/n$ (d.f. up to a multiplicative constant). The assertion is proved.
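The step from the integrand to a Poisson type term is a purely algebraic identity, which the following sketch checks at a few points. The function names (`lk_integrand`, `poisson_type_term`) and the sample values are assumptions made for illustration; `mass` stands for the increment of $\Psi$ over one subdivision interval.

```python
import cmath

def lk_integrand(u, x):
    """(e^{iux} - 1 - iux/(1+x^2)) (1+x^2)/x^2, with value -u^2/2 at x = 0."""
    if x == 0.0:
        return complex(-0.5 * u * u, 0.0)
    return (cmath.exp(1j * u * x) - 1.0 - 1j * u * x / (1.0 + x * x)) * (1.0 + x * x) / (x * x)

def poisson_type_term(u, x, mass):
    """iua + lam (e^{iub} - 1) with lam, a, b chosen as in the Riemann-Stieltjes sums."""
    lam = (1.0 + x * x) / (x * x) * mass
    a = -lam * x / (1.0 + x * x)
    b = x
    return 1j * u * a + lam * (cmath.exp(1j * u * b) - 1.0)

u = 0.9  # illustrative argument
for x, mass in ((-2.0, 0.4), (0.5, 1.5), (3.0, 0.1)):
    print(mass * lk_integrand(u, x), poisson_type_term(u, x, mass))
```

The two printed values in each row coincide up to rounding, and `lk_integrand(u, x)` approaches $-u^2/2$ as $x \to 0$, in agreement with the definition by continuity.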

Remark. If $\int x^2\,d\Psi(x) < \infty$, then

\[ \psi(u) = iua + \int (e^{iux} - 1 - iux)\frac{1}{x^2}\,dK(x) \]

where

\[ a = \alpha + \int x\,d\Psi(x) \in R, \qquad dK(x) = (1 + x^2)\,d\Psi(x), \]

and the i.d. ch.f. $e^{\psi}$ has for first moment $a$ and for variance $\operatorname{Var} K < \infty$ (take the first two derivatives at $u = 0$). Conversely, if an i.d. ch.f. $e^{\psi}$ has second (hence first) finite moment, then $\int x^2\,d\Psi(x) < \infty$ (take the second symmetric derivative at $u = 0$). Thus, the family of all limit laws in the case of bounded variances coincides with the subfamily of i.d. laws with finite second moments.

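We establish now two properties of functions $\psi$ corresponding to the unicity and continuity theorems for ch.f.'s. They will be reduced to these theorems by making correspond to functions $\psi$ functions $\varphi$ and $\Phi$, with same affixes as $\psi$ if any. We define $\varphi$ on $R$ by

\[ \varphi(u) = \psi(u) - \int_0^1 \frac{\psi(u+h) + \psi(u-h)}{2}\,dh. \]

We have, upon replacing $\psi$ by its defining relation and interchanging the integrations,

\[ \varphi(u) = \int e^{iux}\left\{ \int_0^1 (1 - \cos hx)\,dh \right\}\frac{1 + x^2}{x^2}\,d\Psi(x) = \int e^{iux}\,d\Phi(x) \]

with

\[ \Phi(x) = \int_{-\infty}^{x}\left(1 - \frac{\sin y}{y}\right)\frac{1 + y^2}{y^2}\,d\Psi(y). \]

Since

\[ 0 < c' \le \left(1 - \frac{\sin x}{x}\right)\frac{1 + x^2}{x^2} \le c'' < \infty, \]

where $c'$ and $c''$ are independent of $x \in R$, it follows that $\Phi$ is nondecreasing on $R$ with $\Phi(-\infty) = 0$ and

\[ c' \operatorname{Var} \Psi \le \operatorname{Var} \Phi \le c'' \operatorname{Var} \Psi < \infty. \]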

C. Unicity theorem. There is a one-to-one correspondence between functions $\psi$ and couples $(\alpha, \Psi)$.

For this reason we shall sometimes write $\psi = (\alpha, \Psi)$.

Proof. By definition, every couple $(\alpha, \Psi)$ determines a function $\psi$. Conversely, if $\psi$ is given, then, by the foregoing considerations, $\psi$ determines a function $\varphi$ which is a ch.f. (up to a constant factor). By the inversion formula for ch.f.'s, $\varphi$ determines $\Phi$ and, in its turn, $\Phi$ determines $\Psi$; furthermore, $\psi$ and $\Psi$ determine $\alpha$, which completes the proof.

D. Convergence theorem. If $\alpha_n \to \alpha$ and $\Psi_n \xrightarrow{c} \Psi$, then $\psi_n \to \psi$.

Conversely, if $\psi_n \to g$ continuous at the origin, then $\alpha_n \to \alpha$ and $\Psi_n \xrightarrow{c} \Psi$ such that $g = \psi = (\alpha, \Psi)$.


Proof. The first assertion follows at once by the Helly-Bray theorem. As for the converse, since the sequence $e^{\psi_n}$ of i.d. ch.f.'s converges to $e^{g}$ continuous at the origin, this convergence is uniform in every finite interval and, by 23.1b and A, $e^{g}$ is an i.d. ch.f. with $e^{g} \ne 0$. Hence, $g$ is finite and continuous on $R$, the sequence $\psi_n$ converges to $g$ uniformly on every finite interval, and

\[ \varphi_n(u) \to g(u) - \int_0^1 \frac{g(u+h) + g(u-h)}{2}\,dh \]

continuous on $R$. In particular,

\[ \operatorname{Var} \Phi_n = \varphi_n(0) \to -\int_0^1 \frac{g(h) + g(-h)}{2}\,dh < \infty, \]

so that variations of the $\Phi_n$ are uniformly bounded. Thus, the continuity theorem applies to the sequence $\varphi_n$, and there exists a nondecreasing function $\Phi$ of bounded variation on $R$ such that, upon applying the Helly-Bray theorem, at every continuity point $x$ of $\Phi$ as well as for $x = \mp\infty$,

\[ \Psi_n(x) = \int_{-\infty}^{x}\left(1 - \frac{\sin y}{y}\right)^{-1}\frac{y^2}{1 + y^2}\,d\Phi_n(y) \to \int_{-\infty}^{x}\left(1 - \frac{\sin y}{y}\right)^{-1}\frac{y^2}{1 + y^2}\,d\Phi(y) = \Psi(x) \]

and, by the same theorem,

\[ iu\alpha_n = \psi_n(u) - \int \left( e^{iux} - 1 - \frac{iux}{1 + x^2} \right)\frac{1 + x^2}{x^2}\,d\Psi_n \to g(u) - \int \left( e^{iux} - 1 - \frac{iux}{1 + x^2} \right)\frac{1 + x^2}{x^2}\,d\Psi = iu\alpha. \]

This terminates the proof.


E. Representation theorem. The family of i.d. ch.f.'s coincides with the family of ch.f.'s of the form $e^{\psi}$.

Proof. According to 23.1c, every $e^{\psi}$ is an i.d. ch.f. Conversely, if, for every $n$, $f = f_n^n$ where $f_n$ is a ch.f. corresponding to a d.f. $F_n$, then, upon applying the preceding convergence theorem, we obtain

\[ \log f(u) = \lim n(f^{1/n}(u) - 1) = \lim n(f_n(u) - 1) = \lim \int (e^{iux} - 1)\,n\,dF_n \]
\[ = \lim \left\{ iu\int \frac{nx}{1 + x^2}\,dF_n + \int \left( e^{iux} - 1 - \frac{iux}{1 + x^2} \right)\frac{1 + x^2}{x^2}\,d\Psi_n \right\} = \lim \psi_n = \text{some } \psi, \]

with

\[ d\Psi_n(x) = n\,\frac{x^2}{1 + x^2}\,dF_n(x) \quad \text{and} \quad \Psi_n \xrightarrow{c} \Psi. \]

The theorem is proved.

23.2 The uan condition. The main computational difficulties arise in connection with the uan condition, and we have to investigate it in detail. We recall that given a sequence of sums $\sum_k X_{nk}$ of independent r.v.'s, the uan condition is that, for every $\epsilon > 0$,

\[ \max_k P[\,|X_{nk}| \ge \epsilon\,] = \max_k \int_{|x| \ge \epsilon} dF_{nk} \to 0. \]

a. The uan condition implies that

\[ \max_k |\mu X_{nk}| \to 0, \qquad \max_k \int_{|x| < \tau} |x|^r\,dF_{nk} \to 0, \qquad r > 0, \ \tau > 0 \text{ finite}. \]

Proof. The medians of a r.v. belong to any interval such that the pr. for the r.v. to be in the interval is greater than 1/2. Since under the uan condition $\min_k P[\,|X_{nk}| < \epsilon\,] > 1/2$ whatever be $\epsilon > 0$, provided $n \ge n_\epsilon$ sufficiently large, it follows that $\max_k |\mu X_{nk}| < \epsilon$ for $n \ge n_\epsilon$ and the first assertion is proved.

Under the same condition, by letting $n \to \infty$ and then $\epsilon \to 0$, we have

\[ \max_k \int_{|x| < \tau} |x|^r\,dF_{nk} \le \epsilon^r + \max_k \int_{\epsilon \le |x| < \tau} |x|^r\,dF_{nk} \le \epsilon^r + \tau^r \max_k \int_{|x| \ge \epsilon} dF_{nk} \to 0, \]

and the second assertion is proved.

A. Uan criteria. The uan condition is equivalent to

\[ \max_k \int \frac{x^2}{1 + x^2}\,dF_{nk} \to 0 \quad \text{or} \quad \max_k |f_{nk} - 1| \to 0 \]

uniformly on every finite interval.

Proof. Under the uan condition, by letting $n \to \infty$ and then $\epsilon \to 0$, we have

\[ \max_k \int \frac{x^2}{1 + x^2}\,dF_{nk} \le \epsilon^2 + \max_k \int_{|x| \ge \epsilon} dF_{nk} \to 0 \]

and, for $|u| \le b < \infty$,

\[ \max_k |f_{nk}(u) - 1| \le \max_k \left| \int_{|x| < \epsilon} (e^{iux} - 1)\,dF_{nk} \right| + \max_k \left| \int_{|x| \ge \epsilon} (e^{iux} - 1)\,dF_{nk} \right| \le b\epsilon + 2\max_k \int_{|x| \ge \epsilon} dF_{nk} \to 0. \]

Conversely, if $\max_k \int \frac{x^2}{1 + x^2}\,dF_{nk} \to 0$, then, for every $\epsilon > 0$,

\[ \max_k \int_{|x| \ge \epsilon} dF_{nk} \le \frac{1 + \epsilon^2}{\epsilon^2}\max_k \int \frac{x^2}{1 + x^2}\,dF_{nk} \to 0, \]

and the uan condition holds.

Since, upon replacing $f_{nk}(u)$ by $\int e^{iux}\,dF_{nk}$ and interchanging the integrations, we have

\[ \max_k \int \frac{x^2}{1 + x^2}\,dF_{nk} = \max_k \int_0^{\infty} e^{-u}(1 - \Re f_{nk}(u))\,du \le \int_0^{\infty} e^{-u}\max_k |f_{nk}(u) - 1|\,du, \]

it follows, by the dominated convergence theorem, that $\max_k |f_{nk} - 1| \to 0$ implies the uan condition, and the proof is complete.
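The interchange of integrations in the last step relies on the identity $\int_0^{\infty} e^{-u}(1 - \cos ux)\,du = \frac{x^2}{1 + x^2}$. A rough numerical sketch follows; the truncation point, the step count, and the names `lhs` and `rhs` are assumptions of this example.

```python
import math

def lhs(x, upper=40.0, steps=100000):
    """Midpoint rule for the integral over (0, upper) of e^{-u} (1 - cos(ux)) du."""
    h = upper / steps
    return h * sum(
        math.exp(-(j + 0.5) * h) * (1.0 - math.cos((j + 0.5) * h * x))
        for j in range(steps)
    )

def rhs(x):
    """Closed form x^2 / (1 + x^2)."""
    return x * x / (1.0 + x * x)

for x in (0.0, 0.5, 3.0):
    print(x, lhs(x), rhs(x))
```

Integrating both sides against $dF_{nk}$ then gives the displayed relation between $\int \frac{x^2}{1+x^2}\,dF_{nk}$ and $1 - \Re f_{nk}$.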

From now on, we fix a finite $\tau > 0$ and, for every d.f. $F$, with or without affixes, we set

\[ a = \int_{|x| < \tau} x\,dF, \qquad \bar{F}(x) = F(x + a), \qquad \bar{f}(u) = \int e^{iux}\,d\bar{F}(x), \]

with same affixes if any.

We observe that $|a| < \tau$ and that the "bar" does not mean "complex-conjugate."

Corollary 1. Under the uan condition, $\max_k |\bar{f}_{nk} - 1| \to 0$ uniformly on every finite interval.

Since, by a, $\max_k |a_{nk}| \le \max_k \int_{|x| < \tau} |x|\,dF_{nk} \to 0$, the r.v.'s $\bar{X}_{nk} = X_{nk} - a_{nk}$ obey the uan condition, and the assertion follows by A.

Corollary 2. Under the uan condition, given $b < \infty$, all $\log f_{nk}(u)$ exist and are finite for $|u| \le b$ and $n \ge n_b$ sufficiently large, and

\[ \log f_{nk}(u) = f_{nk}(u) - 1 + \theta_{nk}|f_{nk}(u) - 1|^2, \qquad |\theta_{nk}| \le 1; \]

similarly for the $\bar{f}_{nk}(u)$.

This follows from A and $\log z = (z - 1) + \theta|z - 1|^2$, $|\theta| \le 1$, for $|z - 1| < \frac{1}{2}$.

From now on, if $b > 0$ is given, then we take $n \ge n_b$ so that the foregoing relations hold.

We are now in a position to establish the inequalities which will lead 
almost at once to the solution of the Central Limit Problem. 

B. Central inequalities. Under the uan condition 、 for ” ^ 
sufficiently large ， there exist two finite positive constants ci = ci(b y r) and 
c 2 = C 2 {b i t) such that 

Cl max \fnk{u) — 1 I ^ f —~-zdVnk ^ c 2 f I log I fnk(u) \ \ du. 
lulSb' J \ X 2 Jo 

The inequalities follow at once, upon applying a, from two inequalities, 
valid for arbitrary r.v.’s, that we establish now. We shall use repeatedly 
the two relations 


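$$\int g(x)\,dF(x + c) = \int g(x - c)\,dF(x), \qquad \int_{|x| < \tau} (x - a)\,dF = a - a\int_{|x| < \tau} dF = a \int_{|x| \ge \tau} dF.$$

B1. Lower bound. There exists a finite positive number $c_1 = c_1(a, b, \tau)$ such that

$$c_1 \max_{|u| \le b} |\bar f(u) - 1| \le \int \frac{x^2}{1+x^2}\,d\bar F.$$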

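Proof. Since, for $|u| \le b < \infty$,

$$|\bar f(u) - 1| = \Big| \int (e^{iu(x-a)} - 1)\,dF \Big| \le 2\int_{|x| \ge \tau} dF + b\,\Big| \int_{|x| < \tau} (x - a)\,dF \Big| + \frac{b^2}{2} \int_{|x| < \tau} (x - a)^2\,dF$$

$$= (2 + b|a|) \int_{|x| \ge \tau} dF + \frac{b^2}{2} \int_{|x| < \tau} (x - a)^2\,dF,$$

where

$$\int_{|x| \ge \tau} dF \le \frac{1 + (\tau + |a|)^2}{(\tau - |a|)^2} \int_{|x| \ge \tau} \frac{(x - a)^2}{1 + (x - a)^2}\,dF$$

and

$$\int_{|x| < \tau} (x - a)^2\,dF \le \{1 + (\tau + |a|)^2\} \int_{|x| < \tau} \frac{(x - a)^2}{1 + (x - a)^2}\,dF,$$

it follows that

$$|\bar f(u) - 1| \le \frac{1}{c_1} \int \frac{x^2}{1+x^2}\,d\bar F,$$

where one may take

$$\frac{1}{c_1} = \{1 + (\tau + |a|)^2\} \Big\{ \frac{b^2}{2} + \frac{2 + b|a|}{(\tau - |a|)^2} \Big\},$$

and the asserted inequality is proved.

Under the uan condition, for $n$ sufficiently large, we have, according to a, $|a| < \frac{\tau}{2}$, and we can take for $c_1 = c_1(b, \tau)$ the value of $c_1$ obtained upon replacing $|a|$ by $\frac{\tau}{2}$. This proves the left-hand side central inequality.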

B2. Upper bound. For $\tau > |\mu|$, $\mu$ a median of $F$, there exists a finite positive number $c_2 = c_2(\mu, b, \tau)$ such that

$$\int \frac{x^2}{1+x^2}\,d\bar F \le c_2 \int_0^b (1 - |f(u)|^2)\,du.$$

If $f(u) \ne 0$ for $|u| \le b$, then $1 - |f(u)|^2$ can be replaced by $2\,|\log |f(u)||$.


Proof. On account of the elementary inequality

$$1 - |f|^2 \le -\log |f|^2 = 2\,|\log |f||,$$

the second assertion follows from the first one. To prove the first assertion, we shall use the symmetrization method and denote by $F^s$ the d.f. of the symmetrized r.v. $X^s = X - X'$ where $X$ and $X'$ are independent and identically distributed, so that the corresponding ch.f. $f^s = |f|^2$.

From the elementary inequality

$$\Big(1 - \frac{\sin bx}{bx}\Big) \frac{1 + x^2}{x^2} \ge c(b) > 0, \qquad x \in R,$$

and the relation (obtained upon interchanging the integrations)

$$\int_0^b (1 - |f(u)|^2)\,du = \int \Big\{ \int_0^b (1 - \cos ux)\,du \Big\}\,dF^s = b \int \Big(1 - \frac{\sin bx}{bx}\Big) \frac{1 + x^2}{x^2} \cdot \frac{x^2}{1 + x^2}\,dF^s,$$

it follows that

$$(1) \qquad \int_0^b (1 - |f(u)|^2)\,du \ge b\,c(b) \int \frac{x^2}{1 + x^2}\,dF^s.$$

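We pass now from $F^s$ to $F^\mu$, the d.f. of $X - \mu$, and set

$$q(t) = P[|X - \mu| \ge t], \qquad q^s(t) = P[|X^s| \ge t], \qquad t \in [0, +\infty),$$

so that, upon applying the weak symmetrization lemma (which says that $q \le 2q^s$) and integrating by parts, we obtain

$$(2) \qquad \int \frac{x^2}{1 + x^2}\,dF^\mu = -\int_0^\infty \frac{t^2}{1 + t^2}\,dq(t) = \int_0^\infty q(t)\,d\Big(\frac{t^2}{1 + t^2}\Big) \le 2 \int_0^\infty q^s(t)\,d\Big(\frac{t^2}{1 + t^2}\Big) = 2 \int \frac{x^2}{1 + x^2}\,dF^s.$$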

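Now, we pass from $F^\mu$ to $\bar F$. From the elementary inequality

$$(x - a)^2 \le (x - \mu)^2 + 2(\mu - a)(x - a),$$

it follows that

$$\int_{|x| < \tau} (x - a)^2\,dF \le \int_{|x| < \tau} (x - \mu)^2\,dF + 2(\tau + |\mu|)\,\Big| \int_{|x| < \tau} (x - a)\,dF \Big| \le \int_{|x| < \tau} (x - \mu)^2\,dF + 2\tau(\tau + |\mu|) \int_{|x| \ge \tau} dF$$

and, hence,

$$\int \frac{x^2}{1 + x^2}\,d\bar F = \int \frac{(x - a)^2}{1 + (x - a)^2}\,dF \le \int_{|x| < \tau} (x - a)^2\,dF + \int_{|x| \ge \tau} dF \le \int_{|x| < \tau} (x - \mu)^2\,dF + \{1 + 2\tau(\tau + |\mu|)\} \int_{|x| \ge \tau} dF.$$

Since

$$\int_{|x| < \tau} (x - \mu)^2\,dF \le \{1 + (\tau + |\mu|)^2\} \int_{|x| < \tau} \frac{(x - \mu)^2}{1 + (x - \mu)^2}\,dF \le \{1 + (\tau + |\mu|)^2\} \int \frac{x^2}{1 + x^2}\,dF^\mu$$

and

$$\int_{|x| \ge \tau} dF \le \frac{1 + (\tau + |\mu|)^2}{(\tau - |\mu|)^2} \int_{|x| \ge \tau} \frac{(x - \mu)^2}{1 + (x - \mu)^2}\,dF \le \frac{1 + (\tau + |\mu|)^2}{(\tau - |\mu|)^2} \int \frac{x^2}{1 + x^2}\,dF^\mu,$$

it follows that

$$(3) \qquad \int \frac{x^2}{1 + x^2}\,d\bar F \le c' \int \frac{x^2}{1 + x^2}\,dF^\mu,$$

where

$$c' = c'(\mu, \tau) = \{1 + (\tau + |\mu|)^2\} \Big\{ 1 + \frac{1 + 2\tau(\tau + |\mu|)}{(\tau - |\mu|)^2} \Big\}.$$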
Together, the inequalities (1), (2), and (3) yield the inequality

$$\int \frac{x^2}{1 + x^2}\,d\bar F \le c_2 \int_0^b (1 - |f(u)|^2)\,du$$

with $c_2 = \dfrac{2c'}{b\,c(b)}$, and the proof is concluded.

Under the uan condition, for $n \ge n_\tau$ sufficiently large, $|\mu| < \frac{\tau}{2}$ and we can take for $c_2 = c_2(b, \tau)$ the value of $c_2$ obtained upon replacing $|\mu|$ by $\frac{\tau}{2}$. This proves the right-hand side central inequality.

23.3 Central Limit Theorem. We are ready for the solution of the 
Central Limit Problem and can follow the same approach as in the case 
of bounded variances, since 

a. Boundedness lemma. Under the uan condition, if $\prod_k |f_{nk}| \to |f|$ continuous, then there exists a finite constant $c > 0$ such that

$$\sum_k \int \frac{x^2}{1 + x^2}\,d\bar F_{nk} \le c < \infty.$$

Proof. It suffices to prove the assertion for $n$ sufficiently large so that, by 23.2A, Corollary 2, all $\log f_{nk}$ exist and are finite. Let $b > 0$ be sufficiently small so that, for $|u| \le b$, $|f(u)| > 0$, and $\log |f(u)|$ exists and is finite. Since $|f|^2$ is a ch.f., $\sum_k \log |f_{nk}| \to \log |f|$ uniformly on $[-b, +b]$, and, by the right-hand side central inequality,

$$\sum_k \int \frac{x^2}{1 + x^2}\,d\bar F_{nk} \le -c_2 \sum_k \int_0^b \log |f_{nk}(u)|\,du \to -c_2 \int_0^b \log |f(u)|\,du < \infty.$$

The assertion follows.

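b. Comparison lemma. Under the uan condition, if there exists a constant $c$ such that whatever be $n$

$$\sum_k \int \frac{x^2}{1 + x^2}\,d\bar F_{nk} \le c < \infty,$$

then

$$\sum_k \{\log \bar f_{nk}(u) - (\bar f_{nk}(u) - 1)\} \to 0, \qquad u \in R.$$

Proof. By 23.2A, Corollaries 1 and 2, $\max_k |\bar f_{nk} - 1| \to 0$ and, given $b > 0$, for $|u| \le b$ and $n$ sufficiently large,

$$\log \bar f_{nk} = \bar f_{nk} - 1 + \theta_{nk}\,|\bar f_{nk} - 1|^2, \qquad |\theta_{nk}| \le 1.$$

By the left-hand side central inequality

$$\sum_k |\bar f_{nk}(u) - 1| \le \frac{1}{c_1} \sum_k \int \frac{x^2}{1 + x^2}\,d\bar F_{nk} \le \frac{c}{c_1} < \infty.$$

It follows that by taking $b > |u|$, where $u \in R$ is arbitrarily fixed,

$$\Big| \sum_k \{\log \bar f_{nk}(u) - (\bar f_{nk}(u) - 1)\} \Big| \le \sum_k |\bar f_{nk}(u) - 1|^2 \le \frac{c}{c_1} \max_k |\bar f_{nk}(u) - 1| \to 0,$$

and the lemma is proved.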

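Since (omitting the subscripts)

$$\log \bar f(u) - (\bar f(u) - 1) = \log f(u) - \Big\{ iua + \int (e^{iux} - 1)\,d\bar F \Big\}$$

and

$$\int (e^{iux} - 1)\,d\bar F = iu \int \frac{x}{1 + x^2}\,d\bar F + \int \Big( e^{iux} - 1 - \frac{iux}{1 + x^2} \Big) \frac{1 + x^2}{x^2} \cdot \frac{x^2}{1 + x^2}\,d\bar F,$$

the sums which figure in the comparison lemma are

$$\log \prod_k f_{nk}(u) - \psi_n(u)$$

where

$$\psi_n(u) = iu\alpha_n + \int \Big( e^{iux} - 1 - \frac{iux}{1 + x^2} \Big) \frac{1 + x^2}{x^2}\,d\Psi_n(x)$$

with

$$\alpha_n = \sum_k \Big\{ a_{nk} + \int \frac{x}{1 + x^2}\,d\bar F_{nk} \Big\}, \qquad \Psi_n(x) = \sum_k \int_{-\infty}^x \frac{y^2}{1 + y^2}\,d\bar F_{nk}(y).$$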

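A. Central limit theorem. Let $X_{nk}$ be uan independent summands.

1° The family of limit laws of sequences $\mathcal{L}(\sum_k X_{nk})$ coincides with the family of i.d. laws or, equivalently, with the family of laws with log of ch.f. $\psi = (\alpha, \Psi)$ defined by

$$\psi(u) = iu\alpha + \int \Big( e^{iux} - 1 - \frac{iux}{1 + x^2} \Big) \frac{1 + x^2}{x^2}\,d\Psi(x),$$

where $\alpha \in R$ and $\Psi$ is a d.f. up to a multiplicative constant.

2° $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(X)$ with log of ch.f. necessarily of the form $\psi = (\alpha, \Psi)$ if, and only if,

$$\Psi_n \xrightarrow{c} \Psi, \qquad \alpha_n \to \alpha,$$

where

$$\alpha_n = \sum_k \Big\{ a_{nk} + \int \frac{x}{1 + x^2}\,d\bar F_{nk} \Big\}, \qquad \Psi_n(x) = \sum_k \int_{-\infty}^x \frac{y^2}{1 + y^2}\,d\bar F_{nk}(y)$$

and

$$a_{nk} = \int_{|x| < \tau} x\,dF_{nk}, \qquad \bar F_{nk}(x) = F_{nk}(x + a_{nk}),$$

with $\tau > 0$ finite and arbitrarily fixed.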

Proof. Every i.d. law is a limit law of the Central Limit Problem. Conversely, if, under the uan condition, $\prod_k f_{nk} \to f$ ch.f., then, on account of the boundedness lemma, the comparison lemma applies and $e^{\psi_n} \to f$. Thus, on the one hand, by the closure theorem for i.d. laws $f = e^{\psi}$ is i.d. and 1° is proved. On the other hand, $\psi_n \to \psi$ and hence, by the convergence theorem for i.d. laws, $\Psi_n \xrightarrow{c} \Psi$, $\alpha_n \to \alpha$, and the "only if" part of 2° is proved.

Conversely, if $\alpha_n \to \alpha$ and $\Psi_n \xrightarrow{c} \Psi$ so that

$$\operatorname{Var} \Psi_n = \sum_k \int \frac{x^2}{1 + x^2}\,d\bar F_{nk} \to \operatorname{Var} \Psi < \infty$$

and the comparison lemma applies, then $\psi_n \to \psi$, hence $\prod_k f_{nk} \to e^{\psi}$, and the "if" part of 2° is proved. This terminates the proof.

Extension. It may happen that under the uan condition, the sequence $\mathcal{L}(\sum_k X_{nk})$ does not converge, yet the sequence $\mathcal{L}(\sum_k X_{nk} - a_n)$ converges for suitably chosen constants $a_n$; this is the situation in the Bernoulli case and, more generally, in the classical limit problem where $X_{nk} = X_k/b_n$ with $b_n = n$ or $s_n$. Then $\prod_k f_{nk}(u)$ is replaced by $e^{-iua_n} \prod_k f_{nk}(u)$ and the boundedness lemma can still be used, since it refers only to the moduli of products. On the other hand, the sums in the comparison lemma can be written $\log \{ e^{-iua_n} \prod_k f_{nk}(u) \} - \{ -iua_n + \psi_n(u) \}$. Since $-iua_n + \psi_n(u)$ is still a $\psi$-function, the Central Limit theorem remains valid, provided $\alpha_n$ is replaced by $\alpha_n - a_n$, and the theorem can be stated as follows:


B. Extended central limit theorem. Let $X_{nk}$ be uan independent summands.

1° The family of limit laws of sequences $\mathcal{L}(\sum_k X_{nk} - a_n)$ coincides with the family of i.d. laws.

2° There exist constants $a_n$ such that the sequence $\mathcal{L}(\sum_k X_{nk} - a_n)$ converges if, and only if, $\Psi_n \xrightarrow{c}$ some $\Psi$ where

$$\Psi_n(x) = \sum_k \int_{-\infty}^x \frac{y^2}{1 + y^2}\,d\bar F_{nk}(y).$$

Then all admissible $a_n$ are of the form $a_n = \alpha_n - \alpha + o(1)$ where $\alpha$ is an arbitrary finite number and $\alpha_n = \sum_k \{ a_{nk} + \int \frac{x}{1 + x^2}\,d\bar F_{nk} \}$, and all possible limit laws have for log of ch.f. $\psi = (\alpha, \Psi)$.








23.4 Central convergence criterion. The convergence criterion 22.2k 
2° is expressed in terms of expressions twice removed from the primary datum, the d.f.'s of the summands, and the probabilistic meaning of these expressions is somewhat hidden. We transform it by unpleasant but elementary computations as follows:

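A. Central convergence criterion. If $X_{nk}$ are uan independent summands, then

$$\prod_k f_{nk} \to f = e^{\psi}, \qquad \psi = (\alpha, \Psi),$$

if, and only if,

(i) at every continuity point $x \ne 0$ of $\Psi$,

$$\sum_k F_{nk}(x) \to \int_{-\infty}^x \frac{1 + y^2}{y^2}\,d\Psi \ \text{ for } x < 0, \qquad \sum_k \{1 - F_{nk}(x)\} \to \int_x^{+\infty} \frac{1 + y^2}{y^2}\,d\Psi \ \text{ for } x > 0;$$

(ii) as $n \to \infty$ and then $\epsilon \to 0$,

$$\sum_k \Big\{ \int_{|x| < \epsilon} x^2\,dF_{nk} - \Big( \int_{|x| < \epsilon} x\,dF_{nk} \Big)^2 \Big\} \to \Psi(+0) - \Psi(-0);$$

(iii) for a fixed $\tau > 0$ such that $\pm\tau$ are continuity points of $\Psi$,

$$\sum_k \int_{|x| < \tau} x\,dF_{nk} \to \alpha + \int_{|x| < \tau} x\,d\Psi - \int_{|x| \ge \tau} \frac{1}{x}\,d\Psi.$$

The iterated limit in (ii) is the generalized iterated limit $\lim_{\epsilon \to 0} \lim_n$.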

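Proof. We have to prove that the three stated conditions are equivalent to

$$(C) \qquad \Psi_n \xrightarrow{c} \Psi \quad \text{with} \quad d\Psi_n(x) = \frac{x^2}{1 + x^2} \sum_k d\bar F_{nk}(x)$$

and

$$(C') \qquad \sum_k \Big\{ a_{nk} + \int \frac{x}{1 + x^2}\,d\bar F_{nk} \Big\} \to \alpha, \quad \text{with} \quad a_{nk} = \int_{|x| < \tau} x\,dF_{nk}, \quad \bar F_{nk}(x) = F_{nk}(x + a_{nk}).$$

1° Let $x$ be continuity points of $\Psi$. It is readily seen that condition (C) can be written as follows:

$$\Psi_n(x) \to \Psi(x) \ \text{ for } x < 0, \qquad \Psi_n(+\infty) - \Psi_n(x) \to \Psi(+\infty) - \Psi(x) \ \text{ for } x > 0$$

and, as $n \to \infty$ and then $\epsilon \to 0$,

$$\Psi_n(+\epsilon) - \Psi_n(-\epsilon) \to \Psi(+0) - \Psi(-0).$$

It follows, upon replacing $\Psi_n$ by its defining expression and applying the Helly-Bray theorem, that (C) is equivalent to

$$(C_1) \qquad \sum_k \bar F_{nk}(x) \to \int_{-\infty}^x \frac{1 + y^2}{y^2}\,d\Psi \ \text{ for } x < 0, \qquad \sum_k \{1 - \bar F_{nk}(x)\} \to \int_x^{+\infty} \frac{1 + y^2}{y^2}\,d\Psi \ \text{ for } x > 0$$

and

$$(C_2) \qquad \sum_k \int_{|x| < \epsilon} \frac{x^2}{1 + x^2}\,d\bar F_{nk} \to \Psi(+0) - \Psi(-0)$$

as $n \to \infty$ and then $\epsilon \to 0$.

Let $a_n = \max_k \int_{|x| < \tau} |x|\,dF_{nk}$ so that $|a_{nk}| \le a_n \to 0$. Since

$$\sum_k F_{nk}(x - a_n) \le \sum_k \bar F_{nk}(x) \le \sum_k F_{nk}(x + a_n),$$

and the continuity points $x$ of $\Psi$ are continuity points of the integrals in $(C_1)$, it follows at once that the first parts of (i) and $(C_1)$ are equivalent; similarly for the second parts. Thus $(C_1)$ is equivalent to (i).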

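2° Since

$$\frac{1}{1 + \epsilon^2} \int_{|x| < \epsilon} x^2\,d\bar F_{nk} \le \int_{|x| < \epsilon} \frac{x^2}{1 + x^2}\,d\bar F_{nk} \le \int_{|x| < \epsilon} x^2\,d\bar F_{nk},$$

condition $(C_2)$ is equivalent to

$$\sum_k \int_{|x| < \epsilon} x^2\,d\bar F_{nk} \to \Psi(+0) - \Psi(-0) \quad \text{as } n \to \infty \text{ and then } \epsilon \to 0.$$

But, on account of (i), as $n \to \infty$ and then $\epsilon \to 0$,

$$\Big| \sum_k \int_{|x| < \epsilon} x^2\,d\bar F_{nk} - \sum_k \int_{|x| < \epsilon} (x - a_{nk})^2\,dF_{nk} \Big| \le 2 \sum_k \int_{\epsilon/2 \le |x| \le 2\epsilon} (x - a_{nk})^2\,dF_{nk} \to 0$$

and, since $a_n \to 0$, we have, for $\epsilon < \tau$,

$$\Big| \sum_k \int_{|x| < \epsilon} (x - a_{nk})^2\,dF_{nk} - \sum_k \Big\{ \int_{|x| < \epsilon} x^2\,dF_{nk} - \Big( \int_{|x| < \epsilon} x\,dF_{nk} \Big)^2 \Big\} \Big| = \Big| \sum_k \Big\{ \Big( \int_{\epsilon \le |x| < \tau} x\,dF_{nk} \Big)^2 - a_{nk}^2 \int_{|x| \ge \epsilon} dF_{nk} \Big\} \Big|$$

$$\le (\tau a_n + a_n^2) \sum_k \int_{|x| \ge \epsilon} dF_{nk} \to 0.$$

Therefore, under (i) or its equivalent $(C_1)$, condition $(C_2)$ is equivalent to (ii). Thus, condition (C) is equivalent to (i) and (ii).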

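3° It remains to prove that, under (C) or its equivalent (i) and (ii), condition (C') is equivalent to (iii). Since

$$\sum_k \int \frac{x}{1 + x^2}\,d\bar F_{nk} = \sum_k \int_{|x| < \tau} x\,d\bar F_{nk} - \sum_k \int_{|x| < \tau} \frac{x^3}{1 + x^2}\,d\bar F_{nk} + \sum_k \int_{|x| \ge \tau} \frac{x}{1 + x^2}\,d\bar F_{nk}$$

and, $\pm\tau$ being continuity points of $\Psi$, we have, by the Helly-Bray theorem,

$$\sum_k \int_{|x| < \tau} \frac{x^3}{1 + x^2}\,d\bar F_{nk} = \int_{|x| < \tau} x\,d\Psi_n \to \int_{|x| < \tau} x\,d\Psi,$$

$$\sum_k \int_{|x| \ge \tau} \frac{x}{1 + x^2}\,d\bar F_{nk} = \int_{|x| \ge \tau} \frac{1}{x}\,d\Psi_n \to \int_{|x| \ge \tau} \frac{1}{x}\,d\Psi,$$

it suffices to prove that $\sum_k \int_{|x| < \tau} x\,d\bar F_{nk} \to 0$. This assertion follows from the fact that $a_n \to 0$ and $\pm\tau$, being continuity points of $\Psi$, are continuity points of integrals in (i), so that, by (i),

$$\Big| \sum_k \int_{|x| < \tau} x\,d\bar F_{nk} \Big| = \Big| \sum_k \int_{|x - a_{nk}| < \tau} (x - a_{nk})\,dF_{nk} \Big| \le \Big| \sum_k \int_{|x| < \tau} (x - a_{nk})\,dF_{nk} \Big| + \Big| \sum_k \Big\{ \int_{|x - a_{nk}| < \tau} - \int_{|x| < \tau} \Big\} (x - a_{nk})\,dF_{nk} \Big|$$

$$\le a_n \sum_k \int_{|x| \ge \tau} dF_{nk} + (\tau + a_n) \sum_k \int_{\tau - a_n \le |x| < \tau + a_n} dF_{nk} \to 0.$$

This terminates the proof.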


Remark 1. In the course of the proof, it was found that condition (i) can be written with $\bar F_{nk}$ instead of $F_{nk}$ and condition (ii) is equivalent to

$$(ii') \qquad \sum_k \int_{|x| < \epsilon} x^2\,d\bar F_{nk} \to \Psi(+0) - \Psi(-0)$$

as $n \to \infty$ and then $\epsilon \to 0$.


Remark 2. In conditions (ii) or (ii'), the passages to the limit can be taken indifferently to be $\lim_{\epsilon \to 0} \limsup_n$ or $\lim_{\epsilon \to 0} \liminf_n$, instead of the generalized iterated limit; we leave the verification to the reader.

Upon using the extended Central Limit theorem, the central convergence criterion extends at once to sums with variable origin, as follows:


B. Extended central convergence criterion. If $X_{nk}$ are uan independent summands, then there exist constants $a_n$ such that $e^{-iua_n} \prod_k f_{nk}(u) \to e^{\psi(u)}$ where $\psi = (\alpha, \Psi)$ if, and only if, conditions (i) and (ii) of the central convergence criterion hold. Then the admissible $a_n$ are of the form

$$a_n = \sum_k \int_{|x| < \tau} x\,dF_{nk} - \alpha - \int_{|x| < \tau} x\,d\Psi + \int_{|x| \ge \tau} \frac{1}{x}\,d\Psi + o(1)$$

where $\pm\tau$ are fixed continuity points of $\Psi$.

This criterion implies properties of $\min_k X_{nk}$ and $\max_k X_{nk}$. In fact, it takes then a more intuitive form, as follows:

C. Extrema criterion. Let $X_{nk}$ be uan independent summands, and let $X_{nk}^{\epsilon} = X_{nk}$ or $0$ according as $|X_{nk}| < \epsilon$ or $|X_{nk}| \ge \epsilon$.

The sequence $\mathcal{L}(\sum_k X_{nk} - a_n)$ converges for suitable constants $a_n$ if, and only if, the sequences $\mathcal{L}(\min_k X_{nk})$, $\mathcal{L}(\max_k X_{nk})$ and $\sum_k \sigma^2 X_{nk}^{\epsilon}$ converge as $n \to \infty$ and then $\epsilon \to 0$.

More precisely, $\mathcal{L}(\sum_k X_{nk} - a_n) \to \mathcal{L}(X)$ with $\mathcal{L}(X)$ necessarily an i.d. law $(\alpha, \Psi)$ if, and only if, as $n \to \infty$ and then $\epsilon \to 0$,

$$\sum_k \sigma^2 X_{nk}^{\epsilon} \to \Psi(+0) - \Psi(-0)$$

and

$$\mathcal{L}(\min_k X_{nk}) \to \mathcal{L}(Y), \qquad \mathcal{L}(\max_k X_{nk}) \to \mathcal{L}(Z)$$

with $F_Y(x) = 1 - e^{-L(x)}$ or $1$ and $F_Z(x) = 0$ or $e^{L(x)}$ according as $x < 0$ or $x > 0$, where

$$L(x) = \int_{-\infty}^x \frac{1 + y^2}{y^2}\,d\Psi(y), \quad x < 0; \qquad L(x) = -\int_x^{+\infty} \frac{1 + y^2}{y^2}\,d\Psi(y), \quad x > 0.$$


Proof. Let $G_n$ be the d.f. of $\min_k X_{nk}$, so that $1 - G_n = \prod_{k=1}^{k_n} (1 - F_{nk})$. For every fixed $x > 0$, $F_{nk}(x) \to 1$ uniformly in $k$ and, hence, $G_n(x) \to 1$. For every fixed $x < 0$, $F_{nk}(x) \to 0$ uniformly in $k$ and, hence, for $n$ sufficiently large,

$$\log (1 - G_n(x)) = \sum_k \log (1 - F_{nk}(x)) = -(1 + o(1)) \sum_k F_{nk}(x).$$

Therefore, the assertion relative to $F_Y$ is equivalent to the first part of condition (i) of the central convergence criterion; similarly for the assertion relative to $F_Z$. The theorem follows.

23.5 Normal, Poisson, and degenerate convergence. We apply now 
the central convergence criterion to the three first-discovered limit 
types. We set 


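$$a_{nk}(\tau) = \int_{|x| < \tau} x\,dF_{nk}, \qquad \sigma_{nk}^2(\tau) = \int_{|x| < \tau} x^2\,dF_{nk} - \Big( \int_{|x| < \tau} x\,dF_{nk} \Big)^2.$$

1° A normal law $\mathcal{N}(\alpha, \sigma^2)$ corresponds to $\psi(u) = iu\alpha - \frac{\sigma^2}{2} u^2$, that is, $\psi = (\alpha, \Psi)$ where $\Psi(x) = 0$ or $\sigma^2$ according as $x < 0$ or $x > 0$.

Normal convergence criterion. If $X_{nk}$ are independent summands, then, for every $\epsilon > 0$,

$$\mathcal{L}(\sum_k X_{nk}) \to \mathcal{N}(\alpha, \sigma^2) \quad \text{and} \quad \max_k P[|X_{nk}| \ge \epsilon] \to 0$$

if, and only if, for every $\epsilon > 0$ and a $\tau > 0$,

(i) $\sum_k P[|X_{nk}| \ge \epsilon] \to 0$,

(ii) $\sum_k \sigma_{nk}^2(\tau) \to \sigma^2$, $\quad \sum_k a_{nk}(\tau) \to \alpha$.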

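Proof. We have, under (i),

$$\max_k P[|X_{nk}| \ge \epsilon] \le \sum_k P[|X_{nk}| \ge \epsilon] \to 0.$$

Furthermore, always under (i), if $\epsilon < \tau$, then

$$\Big| \sum_k \sigma_{nk}^2(\tau) - \sum_k \sigma_{nk}^2(\epsilon) \Big| \le \sum_k \int_{\epsilon \le |x| < \tau} x^2\,dF_{nk} + 2\tau \sum_k \Big| \int_{\epsilon \le |x| < \tau} x\,dF_{nk} \Big| \le 3\tau^2 \sum_k \int_{\epsilon \le |x| < \tau} dF_{nk} \to 0,$$

and the same is true for $\epsilon > \tau$; it suffices to interchange $\epsilon$ and $\tau$ in the foregoing chain of inequalities. Upon taking into account these consequences of (i), the foregoing criterion follows from the central convergence criterion applied to the limit law $\mathcal{N}(\alpha, \sigma^2)$.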


Corollary. If $X_{nk}$ are independent summands and the sequence $\mathcal{L}(\sum_k X_{nk})$ converges, then the limit law is normal and the uan condition is satisfied if, and only if, $\max_k |X_{nk}| \xrightarrow{P} 0$.

Upon setting $p_{nk} = P[|X_{nk}| \ge \epsilon]$, it suffices to observe that, because of the independence of the summands,

$$P[\max_k |X_{nk}| \ge \epsilon] = 1 - \prod_k (1 - p_{nk}).$$

For, upon applying the elementary inequality

$$1 - \exp\Big[-\sum_k p_{nk}\Big] \le 1 - \prod_k (1 - p_{nk}) \le \sum_k p_{nk},$$

it follows that the asserted condition is equivalent to condition (i) of the above criterion.

2° The Poisson law $\mathcal{P}(\lambda)$ corresponds to $\psi(u) = \lambda(e^{iu} - 1)$ and, consequently, to $\psi = (\frac{\lambda}{2}, \Psi)$ with $\Psi(x) = 0$ or $\frac{\lambda}{2}$ according as $x < 1$ or $x > 1$. Upon applying the central convergence criterion and observing that the condition relative to the $\sigma_{nk}^2(\epsilon)$ reduces exactly as in the normal case, we obtain the

Poisson convergence criterion. If $X_{nk}$ are uan independent summands, then $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{P}(\lambda)$ if, and only if, for every $\epsilon \in (0, 1)$ and a $\tau \in (0, 1)$,

(i) $\sum_k \int_{|x| \ge \epsilon,\ |x - 1| \ge \epsilon} dF_{nk} \to 0$ and $\sum_k \int_{|x - 1| < \epsilon} dF_{nk} \to \lambda$,

(ii) $\sum_k \sigma_{nk}^2(\tau) \to 0$ and $\sum_k a_{nk}(\tau) \to 0$.
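As a hedged illustration of the Poisson convergence criterion, take the simplest triangular array: $n$ independent summands, each equal to 1 with pr. $\lambda/n$ and to 0 otherwise. Then the mass near 1 sums to $\lambda$, there is no mass elsewhere away from 0, and the truncated means and variances vanish for $\tau < 1$. The sketch below (function names are assumptions made for the example) computes the total variation distance between the exact law of the row sum and $\mathcal{P}(\lambda)$.

```python
import math

def binom_pmf(n, p, k):
    """P[S = k] for S a sum of n independent Bernoulli(p) summands (log-space)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(lam, k):
    """P[N = k] for N Poisson with parameter lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def tv_distance(n, lam):
    """Total variation distance between the law of the row sum and Poisson(lam)."""
    p = lam / n
    diff = sum(abs(binom_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(n + 1))
    # Poisson mass beyond n, where the row sum has no mass
    tail = max(0.0, 1.0 - sum(poisson_pmf(lam, k) for k in range(n + 1)))
    return 0.5 * (diff + tail)

for n in (10, 100, 1000):
    print(n, tv_distance(n, 2.0))
```

The distances shrink as $n$ grows, as the criterion predicts for this array.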


3° The degenerate law $\mathcal{L}(0)$ can be considered as a degenerate normal $\mathcal{N}(0, 0)$ so that the normal convergence criterion reduces to the

Degenerate convergence criterion. If $X_{nk}$ are independent summands, then $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(0)$ and the uan condition is satisfied if, and only if, for every $\epsilon > 0$ and a $\tau > 0$,

(i) $\sum_k \int_{|x| \ge \epsilon} dF_{nk} \to 0$,

(ii) $\sum_k \sigma_{nk}^2(\tau) \to 0$, $\quad \sum_k a_{nk}(\tau) \to 0$.


Corollary 1. If $X_k$ are independent summands and $b_n \uparrow \infty$, then $\mathcal{L}(S_n/b_n) \to \mathcal{L}(0)$ if, and only if, for every $\epsilon > 0$,

(i) $\sum_k \int_{|x| \ge \epsilon b_n} dF_k \to 0$,

(ii) $\dfrac{1}{b_n^2} \sum_k \Big\{ \int_{|x| < b_n} x^2\,dF_k - \Big( \int_{|x| < b_n} x\,dF_k \Big)^2 \Big\} \to 0$,

(iii) $\dfrac{1}{b_n} \sum_k \int_{|x| < b_n} x\,dF_k \to 0$.

Because of the above criterion, taking $\tau = 1$ and observing that for $X_{nk} = X_k/b_n$, $F_{nk}(x) = F_k(b_n x)$, it remains only to prove that $\mathcal{L}(S_n/b_n) \to \mathcal{L}(0)$ implies the uan condition. This follows from the fact that $P[|S_n/b_n| < \epsilon] > 1 - \delta$, for $n \ge n_{\epsilon,\delta}$ sufficiently large, implies that, for $n > n_{\epsilon,\delta}$,

$$P\Big[ \Big| \frac{X_n}{b_n} \Big| < 2\epsilon \Big] \ge P\Big[ \Big| \frac{S_n}{b_n} \Big| < \epsilon, \ \frac{b_{n-1}}{b_n} \Big| \frac{S_{n-1}}{b_{n-1}} \Big| < \epsilon \Big] \ge 1 - 2\delta.$$


Remark. For the degenerate convergence criterion, (ii) and (i) with $\epsilon = \tau$ imply that $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(0)$. For, as in 21.2A, by Tchebichev inequality, (ii) implies that $\mathcal{L}(\sum_k X_{nk}^{\tau}) \to \mathcal{L}(0)$ for the summands truncated at $\tau$ and then, by 21.1b, (i) implies that $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(0)$.

In particular, in Corollary 1, we may take $\epsilon = 1$. Thus, for $b_n = n$, we have

Corollary 2. If $X_k$ are independent summands, then $\mathcal{L}(S_n/n) \to \mathcal{L}(0)$ if, and only if,

(i) $\sum_k \int_{|x| \ge n} dF_k \to 0$,

(ii) $\dfrac{1}{n^2} \sum_k \Big\{ \int_{|x| < n} x^2\,dF_k - \Big( \int_{|x| < n} x\,dF_k \Big)^2 \Big\} \to 0$,

(iii) $\dfrac{1}{n} \sum_k \int_{|x| < n} x\,dF_k \to 0$.

This is the classical degenerate convergence criterion.
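To see condition (i) of the classical criterion at work, here is a small sketch for identically distributed summands, where the sum over $k = 1, \dots, n$ in (i) equals $n\,P[|X_1| \ge n]$. The two laws chosen (standard Cauchy and standard normal) and the function names are assumptions made for the illustration; the tail formulas used are the standard closed forms.

```python
import math

def tail_sum_cauchy(n):
    """n * P[|X| >= n] for a standard Cauchy X,
    using P[|X| >= t] = (2/pi) * atan(1/t) for t > 0."""
    return n * (2.0 / math.pi) * math.atan(1.0 / n)

def tail_sum_normal(n):
    """n * P[|X| >= n] for a standard normal X,
    using P[|X| >= t] = erfc(t / sqrt(2))."""
    return n * math.erfc(n / math.sqrt(2.0))

for n in (1, 10, 100, 1000):
    print(n, tail_sum_cauchy(n), tail_sum_normal(n))
```

For the Cauchy law the quantity in (i) tends to $2/\pi$ rather than to 0, so $S_n/n$ does not converge in law to 0; for the normal law it tends to 0 very fast.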

The reader is invited to specialize 23.4C to the three foregoing cases. In particular, it implies the corollary to the normal convergence criterion. As for the Poisson case, $dL(x) = 0$ or $\lambda$ according as $x \ne 1$ or $x = 1$ so that

If $\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(X)$, then $\mathcal{L}(X) = \mathcal{P}(\lambda)$ if, and only if, $\mathcal{L}(\min_k X_{nk}) \to \mathcal{L}(0)$ and $\mathcal{L}(\max_k X_{nk}) \to \mathcal{L}(0, 1)$ with two values 0 and 1 only of pr. $e^{-\lambda}$ and $1 - e^{-\lambda}$, respectively.

*§ 24. NORMED SUMS

*24.1 The problem. Let $\frac{S_n}{b_n} - a_n$ be normed sums with d.f. $G_n$ and ch.f. $g_n$, where $S_n = \sum_{k=1}^n X_k$ are consecutive sums of independent r.v.'s $X_k$ with d.f.'s $F_k$ and ch.f.'s $f_k$, and where $a_n$, $b_n > 0$ are finite numbers; thus

$$g_n(u) = e^{-iua_n} \prod_{k=1}^n f_k\Big(\frac{u}{b_n}\Big).$$

In what follows $k$ runs over $1, \dots, n$; $n = 1, 2, \dots$.

If the $X_{nk} = X_k/b_n$ obey the uan condition:

$$\max_k P[|X_k| \ge \epsilon b_n] \to 0 \quad \text{or} \quad \max_k \int \frac{x^2}{b_n^2 + x^2}\,dF_k(x) \to 0 \quad \text{or} \quad \max_k \Big| f_k\Big(\frac{u}{b_n}\Big) - 1 \Big| \to 0,$$

then, according to the extended Central Limit Theorem, all possible limit laws of sequences $\frac{S_n}{b_n} - a_n$ of normed sums form a family $\mathfrak{N}$ of i.d. laws, and the extended central convergence criterion applies with $F_{nk}(x) = F_k(b_n x)$.
However, in the case of normed sums, new problems arise. 

1° Given a sequence $X_n$ of independent r.v.'s, find whether there exist sequences $a_n$ and $b_n > 0$ such that the uan condition (for the $X_k/b_n$) is satisfied and $g_n \to f$ ch.f., necessarily of the form $e^{\psi}$ with $\psi = (\alpha, \Psi)$; and if such sequences exist, then characterize them.

2° Characterize the family $\mathfrak{N}$; in other words, characterize those i.d. ch.f.'s $e^{\psi}$ and the corresponding functions $\Psi$ which represent limit laws of normed sums obeying the uan condition.

But on the one hand, according to the convergence of types theorem, there always exist sequences $a_n$ and $b_n > 0$ such that the limit laws of $\frac{S_n}{b_n} - a_n$ are degenerate and, on the other hand, all degenerate laws belong to $\mathfrak{N}$: $e^{iua} = (e^{iua/n})^n$. Thus, whenever convenient, we can and do exclude degenerate limit laws from our considerations.


If $g_n \to f$ nondegenerate ch.f., then the uan condition for the $X_k/b_n$ implies that $b_n \to \infty$ and $b_{n+1}/b_n \to 1$.

Proof. We have

$$g_n(u) = e^{-iua_n} \prod_{k=1}^n f_k\Big(\frac{u}{b_n}\Big) \to f(u) \ \text{ nondegenerate.}$$

If $b_n \not\to \infty$, then the sequence $b_n$ contains a bounded subsequence and, by the Bolzano-Weierstrass lemma, this subsequence contains another sequence $b_{n'} \to b$ finite as $n' \to \infty$. Setting $u_{n'} = b_{n'} u$, the uan condition implies that for every $k$, $f_k(u) = f_k(u_{n'}/b_{n'}) \to 1$; hence, $f_k = 1$ and $|f| = 1$. This contradicts the nondegeneracy assumption so that, ab contrario, $b_n \to \infty$.

Since $X_{n+1}/b_{n+1} \xrightarrow{P} 0$, it follows by the law-equivalence lemma that the limit laws of the sequences $\frac{S_n}{b_n} - a_n$ and $\frac{S_n}{b_{n+1}} - a_{n+1} = \frac{S_{n+1}}{b_{n+1}} - a_{n+1} - \frac{X_{n+1}}{b_{n+1}}$ are the same. Thus $e^{-iua'_n} g_n(b'_n u) \to f(u)$ as $n \to \infty$, for suitable constants $a'_n$, with $b'_n = b_n/b_{n+1}$ and $f$ nondegenerate. It follows, by the corollary to the convergence of types theorem, that $b_{n+1}/b_n \to 1$. The proof is complete.

*24.2 Norming sequences. We have at our disposal the necessary tools to solve the problem of existence and determination of norming sequences $a_n$ and $b_n > 0$. Given the summands, we know, according to the convergence of types theorem, that 1° all the limit laws belong, if they exist, to the positive type of one i.d. law and 2° it suffices to find one pair of such sequences. Furthermore, on account of the extended convergence criterion (with $X_{nk} = X_k/b_n$), 3° if there exists a limit i.d. positive type, then the $a_n$ are determined by the expression given there, 4° the uan condition is satisfied and $g_n \to e^{\psi}$ if, and only if,


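$$\max_k \int \frac{x^2}{b_n^2 + x^2}\,dF_k(x) \to 0 \quad \text{and} \quad \Psi_n \xrightarrow{c} \Psi,$$

where $\Psi_n$ are defined on $R$ by

$$(D) \qquad \Psi_n(x) = \sum_{k=1}^n \int_{-\infty}^{b_n x} \frac{y^2}{b_n^2 + y^2}\,dF_k(y + a_{nk}), \qquad a_{nk} = \int_{|x| < \tau b_n} x\,dF_k(x),$$

with $\pm\tau \ne 0$ fixed continuity points of $\Psi$ (we shall see later that any $\tau$ is admissible, so that we may set, say, $\tau = 1$). The theorem below completes the answer. As usual, the superscript "s" will denote the operation of symmetrization.

A. Norming theorem. There exist sequences $b_n$ such that $\mathcal{L}\big(\frac{S_n}{b_n} - a_n\big) \to \mathcal{L}(X)$ for suitable $a_n$ if, and only if, there exists a $\Psi$ such that, upon setting in (D), $b_n = b'_n > 0$ determined by

$$\frac{1}{2} \sum_{k=1}^n \int \frac{x^2}{b_n'^2 + x^2}\,dF_k^s(x) = \Psi(+\infty),$$

we have

$$\Psi_n \xrightarrow{c} \Psi, \qquad \max_k \int \frac{x^2}{b_n'^2 + x^2}\,dF_k(x) \to 0.$$

Proof. The "if" assertion follows by taking normed sums $\frac{S_n}{b'_n} - a_n$.

Because of the corollary to the convergence of types theorem and of the extended central convergence criterion, the "only if" assertion will follow by proving that if $\mathcal{L}\big(\frac{S_n}{b_n} - a_n\big) \to \mathcal{L}(X)$ with ch.f. $e^{\psi}$, $\psi = (\alpha, \Psi)$, then $b'_n/b_n \to 1$.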

Upon symmetrizing, the hypothesis becomes $\mathcal{L}(S_n^s/b_n) \to \mathcal{L}(X^s)$ and the corresponding $\Psi^s$ is defined by

$$\Psi^s(x) = \Psi(x) + \Psi(+\infty) - \Psi(-x + 0).$$

Thus $\Psi_n^s \xrightarrow{c} \Psi^s$ where $\Psi_n^s$ are defined by

$$\Psi_n^s(x) = \sum_{k=1}^n \int_{-\infty}^{b_n x} \frac{y^2}{b_n^2 + y^2}\,dF_k^s(y).$$

Upon using $\Psi^s(+\infty) = 2\Psi(+\infty)$, and (D) with $b_n$ replaced by $b'_n$, it follows that

$$\Delta_n = \sum_{k=1}^n \int \Big\{ \frac{x^2}{b_n^2 + x^2} - \frac{x^2}{b_n'^2 + x^2} \Big\}\,dF_k^s(x) \to 0.$$

On the other hand, since degenerate limit laws are excluded, $\Psi^s$ does not reduce to a constant. Therefore, there exists an $a > 0$ such that $2\delta = \Psi^s(a) - \Psi^s(-a + 0) > 0$ and, hence, for $n \ge n_a$ sufficiently large,

$$\sum_{k=1}^n \int_{-ab_n}^{+ab_n} \frac{x^2}{b_n^2 + x^2}\,dF_k^s(x) > \delta > 0.$$

It follows that

$$0 \leftarrow |\Delta_n| = |b_n^2 - b_n'^2| \sum_{k=1}^n \int \frac{x^2}{(b_n^2 + x^2)(b_n'^2 + x^2)}\,dF_k^s(x) \ge \frac{|b_n^2 - b_n'^2|}{b_n'^2 + a^2 b_n^2} \sum_{k=1}^n \int_{-ab_n}^{+ab_n} \frac{x^2}{b_n^2 + x^2}\,dF_k^s(x)$$

$$\ge \frac{|(b_n/b'_n)^2 - 1|}{1 + a^2 b_n^2/b_n'^2}\,\delta \ge 0,$$

so that $b_n/b'_n \to 1$, and the proof is complete.

*24.3 Characterization of $\mathfrak{N}$. We characterize $\mathfrak{N}$ by a decomposability property and, then, we characterize the corresponding functions $\Psi$.

In order to define the decomposability property we prove

a. If to a ch.f. $f$ there corresponds a number $c > 0$ and a nondegenerate ch.f. $f_c$ such that, for every $u$, $f(u) = f(cu) f_c(u)$, then $c < 1$.

Proof. If $c = 1$, then $f_c = 1$. If $c > 1$, then, replacing repeatedly in the assumed relation $u$ by $\frac{u}{c}$ and $|f_c|$ by 1, we have

$$1 \ge |f(u)| \ge \Big| f\Big(\frac{u}{c}\Big) \Big| \ge \cdots \ge \lim_n \Big| f\Big(\frac{u}{c^n}\Big) \Big| = f(0) = 1$$

and $f$ is degenerate, so that $f_c$ is degenerate. The assertion follows ab contrario.

We say that a law and its ch.f. $f$ are self-decomposable if, for every $c \in (0, 1)$, there exists a ch.f. $f_c$ such that, for every $u$, $f(u) = f(cu) f_c(u)$. Clearly, a degenerate ch.f. is self-decomposable and all its components $f_c$ are also degenerate.
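A concrete instance may help fix the definition. Assuming the exponential law with mean 1, whose ch.f. is $f(u) = 1/(1 - iu)$ (a standard fact, not taken from the text), the ratio $f(u)/f(cu) = (1 - icu)/(1 - iu)$ equals $c + (1 - c)/(1 - iu)$, the ch.f. of a mixture placing mass $c$ at 0 and mass $1 - c$ on the same exponential law. The sketch below checks this equality numerically; all names are illustrative.

```python
def f_exp(u):
    """Ch.f. of the exponential law with mean 1."""
    return 1.0 / (1.0 - 1j * u)

def f_c_ratio(u, c):
    """Candidate component f_c(u) = f(u) / f(c*u)."""
    return f_exp(u) / f_exp(c * u)

def f_c_mixture(u, c):
    """Ch.f. of the mixture: mass c at 0, mass 1 - c on the exponential law."""
    return c + (1.0 - c) * f_exp(u)

def max_gap(c, grid):
    """Largest discrepancy between the two expressions over the grid."""
    return max(abs(f_c_ratio(u, c) - f_c_mixture(u, c)) for u in grid)

grid = [k / 10.0 for k in range(-200, 201)]
for c in (0.1, 0.5, 0.9):
    print(c, max_gap(c, grid))
```

Since the mixture is a genuine law for every $c \in (0, 1)$, this exponential law is self-decomposable in the sense just defined.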

b. If $f$ is self-decomposable, then $f \ne 0$.

Proof. If $f(2a) = 0$ and $f(u) \ne 0$ for $0 \le u < 2a$, then $f_c(2a) = 0$. Upon replacing $t$ and $h$ by $a$ in

$$|f_c(t + h) - f_c(t)|^2 \le 2\{1 - \Re f_c(h)\},$$

we obtain

$$|f_c(a)|^2 \le 2\{1 - \Re f_c(a)\}.$$

This leads to a contradiction since, by letting $c \to 1$, we obtain $f_c(a) = \dfrac{f(a)}{f(ca)} \to 1$ and the inequality becomes $1 \le 0$. The assertion follows ab contrario.

A. Self-decomposability criterion. A law belongs to $\mathfrak{N}$ if, and only if, it is self-decomposable.

Proof. A degenerate law certainly belongs to $\mathfrak{N}$, so that it suffices to consider nondegenerate laws with ch.f. $f$.

1° If $f$ is self-decomposable, then let $X_k$ ($k = 1, \dots, n$) be independent r.v.'s, with ch.f. $f_k$ defined by

$$f_k(u) = f_{\frac{k-1}{k}}(ku) = \frac{f(ku)}{f((k - 1)u)}.$$

Since $f_k\big(\frac{u}{n}\big) = \dfrac{f\big(\frac{k}{n} u\big)}{f\big(\frac{k-1}{n} u\big)} \to 1$ uniformly in $k$ and the ch.f. of $\frac{S_n}{n}$ is given by

$$\prod_{k=1}^n f_k\Big(\frac{u}{n}\Big) = f(u),$$

the "if" assertion follows.
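The telescoping construction in 1° can be tried out numerically. The sketch below again assumes the exponential ch.f. $f(u) = 1/(1 - iu)$ as a self-decomposable example (an assumption made for illustration) and checks both that the product of the $f_k(u/n)$ reproduces $f(u)$ and that $\max_k |f_k(u/n) - 1|$ is small for large $n$.

```python
def f(u):
    """Ch.f. of the exponential law with mean 1 (illustrative choice)."""
    return 1.0 / (1.0 - 1j * u)

def f_k(k, u):
    """Ch.f. of the k-th summand: f(k*u) / f((k-1)*u)."""
    return f(k * u) / f((k - 1) * u)

def product_of_summands(n, u):
    """Ch.f. of S_n / n, that is, the product of f_k(u/n) over k = 1..n."""
    prod = 1.0 + 0.0j
    for k in range(1, n + 1):
        prod *= f_k(k, u / n)
    return prod

def max_deviation(n, u):
    """max over k of |f_k(u/n) - 1|, the quantity controlling the uan condition."""
    return max(abs(f_k(k, u / n) - 1.0) for k in range(1, n + 1))

print(abs(product_of_summands(50, 1.7) - f(1.7)))
print(max_deviation(1000, 1.7))
```

For this choice $f_k(u/n) - 1 = (iu/n)/(1 - iku/n)$, so the deviation is below $|u|/n$ for every $k$.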


2° Conversely, let $f$ belong to $\mathfrak{N}$. There exist normed sums $\frac{S_n}{b_n} - a_n$ with ch.f. $g_n$ such that, denoting by $f_k$ the ch.f. of summands $X_k$,

$$g_n(u) = e^{-iua_n} \prod_{k=1}^n f_k\Big(\frac{u}{b_n}\Big) \to f(u)$$

and, by 24.1b, $b_n \to \infty$, $\dfrac{b_{n+1}}{b_n} \to 1$. Then, given $c \in (0, 1)$, we can make correspond to every integer $n$ an integer $m < n$ such that $\dfrac{b_m}{b_n} \to c$ and $m$, $n - m \to \infty$ as $n \to \infty$. Since

$$(1) \qquad g_n(u) = \Big\{ g_m\Big(\frac{b_m}{b_n} u\Big) \Big\} \Big\{ e^{-iu(a_n - \frac{b_m}{b_n} a_m)} \prod_{k=m+1}^n f_k\Big(\frac{u}{b_n}\Big) \Big\}$$

where $g_n(u) \to f(u)$, and the first bracket converges to $f(cu)$, it follows that the ch.f. $g_{m,n}$, whose values figure within the second bracket, converges to the continuous function $f_c$ defined by $f_c(u) = \dfrac{f(u)}{f(cu)}$. Therefore, by the continuity theorem, $f_c$ is a ch.f., and the proof is concluded.

Corollary. A self-decomposable ch.f. $f$ and its components $f_c$ are i.d.

Proof. Since $f$ belongs to $\mathfrak{N}$, $f$ is i.d. On the other hand, upon taking for $f_k$ the ch.f. of r.v.'s $X_k$ defined in 1° and $m < n$ such that $\frac{m}{n} \to c$, we have

$$f(u) = \prod_{k=1}^m f_k\Big(\frac{m}{n} \cdot \frac{u}{m}\Big) \prod_{k=m+1}^n f_k\Big(\frac{u}{n}\Big).$$

The first product converges to $f(cu)$; the second one converges to $f_c(u)$. Thus, $f_c$ is ch.f. of the limit law of sums $\sum_{k=m+1}^n X_{nk}$ where the summands $X_{nk} = \frac{X_k}{n}$ obey the uan condition. Therefore, $f_c$ is an i.d. ch.f., and the proof is concluded.

We express now the self-decomposability criterion in terms of functions $\Psi$ which figure in the representation of the i.d. self-decomposable ch.f.'s.

B. $\Psi$-criterion. Self-decomposable laws coincide with i.d. laws with functions $\Psi$ such that on $(-\infty, 0)$ and on $(0, +\infty)$, their left and right derivatives, denoted indifferently by $\Psi'(x)$, exist and $\dfrac{1 + x^2}{x} \Psi'(x)$ do not increase.

Proof. Because of the preceding corollary, the self-decomposability property of a ch.f. $f$, necessarily of the form $e^{\psi}$, is as follows: for every $c \in (0, 1)$ the difference $\psi_c(u) = \psi(u) - \psi(cu)$ defines a $\psi$-function (a log of an i.d. ch.f.).
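Upon replacing $x$ by $c^{-1}x$, we can write

$$(1) \qquad \psi(cu) = iu \Big\{ \alpha c + (1 - c^2) \int \frac{x}{1 + x^2}\,d\Psi(c^{-1}x) \Big\} + \int \Big( e^{iux} - 1 - \frac{iux}{1 + x^2} \Big) \frac{1 + c^{-2}x^2}{c^{-2}x^2}\,d\Psi(c^{-1}x).$$

Thus

$$\psi_c(u) = iu\alpha_c + \int \Big( e^{iux} - 1 - \frac{iux}{1 + x^2} \Big) \frac{1 + x^2}{x^2}\,d\Psi_c(x),$$

where $\alpha_c$ is a finite number and

$$(2) \qquad d\Psi_c(x) = d\Psi(x) - \frac{1 + c^{-2}x^2}{c^{-2}(1 + x^2)}\,d\Psi(c^{-1}x), \qquad \Psi_c(-\infty) = 0.$$

Since $\Psi_c$ is a difference of two $\Psi$-functions, its variation on $R$ is bounded. It follows readily that $\psi_c$ is a $\psi$-function if, and only if, $\Psi_c$ is nondecreasing on $R$. Since

$$(3) \qquad \Psi_c(+0) - \Psi_c(-0) = (1 - c^2)\{\Psi(+0) - \Psi(-0)\} \ge 0,$$

the self-decomposability property becomes $d\Psi_c(x) \ge 0$ for every $c \in (0, 1)$ and $x \ne 0$ or, equivalently, on account of (2), for every $c \in (0, 1)$ and arbitrary $x' < x''$, $x'x'' > 0$,

$$(4) \qquad \int_{x'}^{x''} \frac{1 + y^2}{y^2}\,d\Psi_c(y) = \int_{x'}^{x''} \frac{1 + y^2}{y^2}\,d\Psi(y) - \int_{x'}^{x''} \frac{1 + c^{-2}y^2}{c^{-2}y^2}\,d\Psi(c^{-1}y) \ge 0.$$

It remains to show that this last inequality implies and is implied by the one asserted in the theorem.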

If

$$J(x) = \int_{+\infty}^{e^x} \frac{1 + y^2}{y^2}\,d\Psi(y), \qquad x \in R,$$

then, by setting in (4) $x' = e^{x - h}$, $x'' = e^x$, $c = e^{-h}$, we obtain

$$J(x) - J(x - h) \ge J(x + h) - J(x) \quad \text{or} \quad J(x) \ge \frac{J(x + h) + J(x - h)}{2}.$$

Therefore, the nondecreasing finite function $J$ on $R$ is convex (from above) and, consequently, $J$ is continuous and its left and right derivatives $J'(x)$ exist and do not increase on $R$. Since

$$\frac{J(x + h) - J(x)}{h} = \frac{1 + e^{2(x + \theta h)}}{e^{2(x + \theta h)}} \cdot \frac{e^{x + h} - e^x}{h} \cdot \frac{\Psi(e^{x + h}) - \Psi(e^x)}{e^{x + h} - e^x}, \qquad 0 \le \theta \le 1,$$

it follows, letting $h \to 0$ and setting $e^x = y$, that the left and right derivatives $\Psi'(y)$ exist and that $\dfrac{1 + y^2}{y} \Psi'(y)$ do not increase on $(0, \infty)$.

Similarly, introducing $J^-(x) = \int_{-\infty}^{-e^x} \dfrac{1 + y^2}{y^2}\,d\Psi(y)$, we find that the same is true on $(-\infty, 0)$. Thus (4) implies the asserted property of $\Psi$.

Conversely, if this asserted property is true then, for every $c \in (0, 1)$ and $x' < x''$, $x'x'' > 0$,

$$\int_{x'}^{x''} \frac{1 + y^2}{y^2}\,d\Psi(y) = \int_{x'}^{x''} \frac{1 + y^2}{y} \Psi'(y)\,\frac{dy}{y} \ge \int_{x'}^{x''} \frac{1 + c^{-2}y^2}{c^{-1}y} \Psi'(c^{-1}y)\,\frac{dy}{y} = \int_{x'}^{x''} \frac{1 + c^{-2}y^2}{c^{-2}y^2}\,d\Psi(c^{-1}y),$$

so that the inequality in (4) holds and the conclusion is reached.


Remark. Since Poisson laws correspond to functions $\Psi$ discontinuous at some $x \ne 0$, they do not belong to the family $\mathfrak{N}$. This explains the isolation in which they remained as long as only limit laws of normed sums were considered.


*24.4 Identically distributed summands and stable laws. The first family $\mathfrak{N}_I$ of limit laws to be investigated by P. Lévy was that of limit laws of normed sums $\frac{S_n}{b_n} - a_n$ of independent and identically distributed summands $X_k$ with an arbitrary common ch.f. $f_0$. In other words, $\mathfrak{N}_I$ is defined as the family of laws whose ch.f.'s $f$ are such that

$$g_n(u) = e^{-iua_n} f_0^n\Big(\frac{u}{b_n}\Big) \to f(u), \qquad u \in R.$$

Clearly, the uan condition is satisfied, so that $\mathfrak{N}_I \subset \mathfrak{N}$. The self-decomposability concept and the criteria for $\mathfrak{N}$ are easily particularized for $\mathfrak{N}_I$, as follows; we exclude degenerate limit laws which, clearly, belong to $\mathfrak{N}_I$. Let a law and its ch.f. $f$ be called stable if, for arbitrary $b > 0$, $b' > 0$, there exist finite numbers $a$ and $b'' > 0$ such that

$$f(b''u) = e^{iua} f(bu) f(b'u), \qquad u \in R.$$

Upon replacing $b''u$ by $u$ and setting $c = \frac{b}{b''}$, we obtain

$$f(u) = e^{iu\frac{a}{b''}} f(cu) f\Big(\frac{b'}{b''} u\Big) = f(cu) f_c(u)$$

where

$$f_c(u) = e^{iu\frac{a}{b''}} f\Big(\frac{b'}{b''} u\Big).$$

The self-decomposability criterion for $\mathfrak{N}$ becomes

A. Stability criterion. A law belongs to $\mathfrak{N}_I$ if, and only if, it is stable.

Proof. The "if" assertion follows from the fact that stability of $f$ implies, taking $f_0 = f$, that the ch.f. of $S_n$ is of the form $f^n(u) = e^{iua_n} f(b_n u)$ so that, norming $S_n$ with these quantities $a_n$ and $b_n$, we have $g_n = f$.

Conversely, leaving out, to simplify the writing, factors of the form $e^{iua}$, which does not restrict the generality, we have to prove that $f_0^n\big(\frac{u}{b_n}\big) \to f(u)$, $u \in R$, implies that to arbitrary $b > 0$, $b' > 0$, there corresponds $b'' > 0$ such that $f(b''u) = f(bu) f(b'u)$. Since $b_n \to \infty$ and $\frac{b_{n+1}}{b_n} \to 1$, we can assign to every integer $n$ integers $m$ and $m'$ such that $\frac{b_m}{b_n} \to b$, $\frac{b_{m'}}{b_n} \to b'$. Then

$$f_0^{m+m'}\Big(\frac{b_{m+m'}}{b_n} \cdot \frac{u}{b_{m+m'}}\Big) = f_0^m\Big(\frac{b_m}{b_n} \cdot \frac{u}{b_m}\Big) \cdot f_0^{m'}\Big(\frac{b_{m'}}{b_n} \cdot \frac{u}{b_{m'}}\Big)$$

and the right-hand side converges to $f(bu) f(b'u)$ while, according to the convergence of types theorem, there exists $b'' > 0$ such that the left-hand side converges to $f(b''u)$. The conclusion is reached.

Thus, a stable law is self-decomposable and, moreover, $f_c$ belongs to the positive type of $f$; in particular $f$ is an i.d. ch.f.

The ^-criterion for 91 is easily transformed and, furthermore, the 
stable ch.f.’s are obtained in terms of elementary functions of analysis, 
as follows. 

B. A function $f$ is a stable ch.f. if, and only if, either

$$\text{(i)} \quad \log f(u) = i\alpha u - b|u|^{\gamma}\left\{1 + ic\,\frac{u}{|u|}\tan\frac{\pi}{2}\gamma\right\}$$

or

$$\text{(ii)} \quad \log f(u) = i\alpha u - b|u|\left\{1 + ic\,\frac{u}{|u|}\,\frac{2}{\pi}\log|u|\right\}$$

with

$$\alpha \in R, \quad b \geq 0, \quad |c| \leq 1, \quad \gamma \in (0, 1) \cup (1, 2].$$

We observe that $\gamma = 2$ gives the normal laws and that real stable
ch.f.'s are of the form $e^{-b|u|^{\gamma}}$, $0 < \gamma \leq 2$.
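
As a numerical illustration of why form (i) is stable, the following sketch evaluates $\log f$ and checks the defining relation $iua + \log f(bu) + \log f(b'u) = \log f(b''u)$ with $b''^{\gamma} = b^{\gamma} + b'^{\gamma}$ and $a = \alpha(b'' - b - b')$. The function names and the arguments `s1`, `s2` (standing for $b$, $b'$ of the definition, renamed to avoid a clash with the coefficient $b$) are our own assumptions, not notation from the text.

```python
import math


def stable_log_cf(u, alpha, b, c, gamma):
    """log f(u) in form (i); requires gamma in (0, 1) or (1, 2]."""
    if gamma == 1:
        raise ValueError("form (i) requires gamma != 1")
    if u == 0:
        return 0j
    sign = 1.0 if u > 0 else -1.0
    skew = 1 + 1j * c * sign * math.tan(math.pi * gamma / 2)
    return 1j * alpha * u - b * abs(u) ** gamma * skew


def stability_defect(u, alpha, b, c, gamma, s1, s2):
    """|iua + log f(s1 u) + log f(s2 u) - log f(s3 u)| for the matching s3, a."""
    s3 = (s1 ** gamma + s2 ** gamma) ** (1.0 / gamma)  # plays the role of b''
    a = alpha * (s3 - s1 - s2)                         # centering constant
    lhs = (1j * u * a
           + stable_log_cf(s1 * u, alpha, b, c, gamma)
           + stable_log_cf(s2 * u, alpha, b, c, gamma))
    rhs = stable_log_cf(s3 * u, alpha, b, c, gamma)
    return abs(lhs - rhs)


if __name__ == "__main__":
    print(stability_defect(0.7, 0.3, 1.2, -0.5, 1.5, 2.0, 3.0))
```

The defect vanishes up to rounding for every admissible choice of parameters, which is the "clearly stable" remark made at the start of the proof.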

Proof. If the asserted forms of $f$ are ch.f.'s, then they are clearly
stable. Thus, we have to prove that these forms are ch.f.'s and that
stable ones are of this form. The first assertion will follow if we can
determine functions $\Psi$ such that $\log f = \psi = (\alpha, \Psi)$.

Let $f = e^{\psi}$ be a stable ch.f., that is, for arbitrary $b > 0$ and $b' > 0$,
there exist $a$ and $b'' > 0$ such that


$$iua + \psi(bu) + \psi(b'u) = \psi(b''u).$$

1° We follow the pattern of f-criterion’s proof ^with c = . Upon 

replacing \f/ by its representation in terms of a and the foregoing 
requirement reduces to 

1 + b 2 x 2 1 + b ,2 x 2 1 + b ,,2 x 2 

d^{bx) -\ --- d^{b'x )= 


b 2 


b' 2 


b" 2 


d^{b n x). 


Upon introducing the functions $J$ and $J^-$ defined on $R$ by

$$J(x) = \int_{e^x}^{\infty} \frac{1 + y^2}{y^2}\,d\Psi(y), \quad J^-(x) = \int_{-\infty}^{-e^x} \frac{1 + y^2}{y^2}\,d\Psi(y), \quad x \in R,$$

and setting ^ ^ = b' t ^ = b", this requirement becomes 

(1) $\{\Psi(+0) - \Psi(-0)\}(b^2 + b'^2 - b''^2) = 0$
and

(2) $J(x + h) + J(x + h') = J(x + h'')$,

$J^-(x + h) + J^-(x + h') = J^-(x + h'')$, $x \in R$,

where $h$, $h'$ are arbitrary numbers and $h''$ is a function of $h$ and $h'$.

Let $\Psi(+\infty) - \Psi(+0) > 0$ so that $J$ does not vanish. If, in the
foregoing relation in $J$, we set repeatedly $h' = h$, it follows that, for
arbitrary positive integers $n$ and $m$,

$$nJ(x + h) = J(x + h_n), \quad \frac{n}{m}J(x + h) = J(x + h_{n,m}).$$

Therefore, to every rational $s > 0$ ($s' > 0$) there corresponds a number
$l$ ($l'$), such that, for every $x$,

(3) $sJ(x) = J(x + l)$.

Since $J$ is continuous from the left and nonincreasing, with $J \geq 0$,
$J(+\infty) = 0$, it follows that $l' \downarrow l$ as $s' \uparrow s$, so that $J$ is continuous, (3)
holds for irrational $s$ as well, and

$l \downarrow -\infty$ as $s \uparrow \infty$.

Since $J$ does not vanish, we can assume, by changing the origin if
necessary, that $J(0) \neq 0$. Then, setting $J_0 = J/J(0)$, it follows, by

$$sJ(0) = J(l), \quad s'J(0) = J(l'), \quad ss'J(0) = J(l + l'),$$

that

$$J_0(l)J_0(l') = J_0(l + l'), \quad l, l' \in R.$$

The only nonvanishing continuous solution of this functional equation,
with $J_0(\infty) = 0$, is proportional to $e^{-\gamma l}$ with $\gamma > 0$. Therefore, setting
$y = e^l$ and going back to $\Psi$, the derivative $\Psi'(y)$ exists for $y > 0$ and

$$\frac{1 + y^2}{y^2}\,\Psi'(y) = \beta' y^{-1-\gamma}, \quad \beta' \geq 0,$$

taking into account the vanishing case. Since $\Psi$ is of bounded variation
on $(0, +\infty)$, it follows that $\int_0^{\epsilon} y^{1-\gamma}\,dy$ is finite for $\epsilon > 0$ and, hence,
$\gamma < 2$. Furthermore, replacing $J$ in (2) by its above-found expression,
we have

$$b^{\gamma} + b'^{\gamma} = b''^{\gamma}, \quad 0 < \gamma < 2.$$

Similarly, with $\gamma'$: for $y < 0$

$$\frac{1 + y^2}{y^2}\,\Psi'(y) = \beta |y|^{-1-\gamma'}, \quad \beta \geq 0,$$

with $b^{\gamma'} + b'^{\gamma'} = b''^{\gamma'}$, hence $\gamma = \gamma'$ (set $b = b' = 1$).

Therefore, on account of (1), either $b^2 + b'^2 = b''^2$ so that $J$ and $J^-$
vanish and $f$ is a normal ch.f., or $\Psi(+0) - \Psi(-0) = 0$ and, for $y \neq 0$,
$\Psi'(y)$ is given by the foregoing relations.

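2° According to what precedes, a stable ch.f. $f$ is either normal or
of the form

$$\text{(1)} \quad \log f(u) = iu\alpha + \beta \int_{-\infty}^{-0} \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) \frac{dx}{|x|^{1+\gamma}} + \beta' \int_{+0}^{+\infty} \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) \frac{dx}{x^{1+\gamma}}.$$

If $0 < \gamma < 1$, then it is possible to take out of the bracket the term
$\dfrac{iux}{1 + x^2}$ and, by modifying $\alpha$, we obtain

$$\text{(2)} \quad \log f(u) = iu\alpha' + \beta \int_{-\infty}^{0} (e^{iux} - 1) \frac{dx}{|x|^{1+\gamma}} + \beta' \int_{0}^{+\infty} (e^{iux} - 1) \frac{dx}{x^{1+\gamma}}.$$

Let $u > 0$. Setting $ux = v$ and integrating along the closed contour
formed by the positive halves of the real and imaginary axes and a
circumference centered at the origin of radius $r \to \infty$, it follows, by
the Cauchy theorem, that

$$\text{(3)} \quad \int_0^{\infty} (e^{iux} - 1) \frac{dx}{x^{1+\gamma}} = |u|^{\gamma} e^{-\frac{\pi}{2} i \gamma}\,\Gamma(-\gamma),$$

where

$$\Gamma(-\gamma) = \int_0^{\infty} (e^{-v} - 1) \frac{dv}{v^{1+\gamma}} < 0.$$

The first integral in (1) follows by taking the complex-conjugate of
(3) and, for $u < 0$, $\log f(u)$ is obtained by taking the complex-conjugate
of $\log f(|u|)$. Upon substituting in (2) and setting

$$b = -\Gamma(-\gamma)(\beta + \beta') \cos \frac{\pi}{2}\gamma, \quad c = \frac{\beta - \beta'}{\beta + \beta'},$$

so that $b \geq 0$, $|c| \leq 1$, we obtain the asserted form (i) of $\log f(u)$.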

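If $1 < \gamma < 2$, then we can take out of the bracket in (1) the term
$-\dfrac{iux}{1 + x^2} + iux$, and (2) is replaced by

$$\text{(4)} \quad \log f(u) = iu\alpha'' + \beta \int_{-\infty}^{-0} (e^{iux} - 1 - iux) \frac{dx}{|x|^{1+\gamma}} + \beta' \int_{+0}^{+\infty} (e^{iux} - 1 - iux) \frac{dx}{x^{1+\gamma}}.$$

Proceeding as above we obtain the same form (i) of $\log f(u)$.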

If $\gamma = 1$, the foregoing modifications of the third term in the bracket
in (1) are no more possible. But, for $u > 0$,

$$\int_{+0}^{+\infty} \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) \frac{dx}{x^2} = \int_0^{\infty} \frac{\cos ux - 1}{x^2}\,dx + i \int_{+0}^{\infty} \left(\sin ux - \frac{ux}{1 + x^2}\right) \frac{dx}{x^2}$$

$$= -\frac{\pi}{2}u + iu \lim_{\epsilon \to 0} \left\{ \int_{\epsilon u}^{\infty} \frac{\sin v}{v^2}\,dv - \int_{\epsilon}^{\infty} \frac{dv}{v(1 + v^2)} \right\}$$

$$= -\frac{\pi}{2}u - iu \lim_{\epsilon \to 0} \left\{ \int_{\epsilon}^{\epsilon u} \frac{\sin v}{v^2}\,dv - \int_{\epsilon}^{\infty} \left(\frac{\sin v}{v^2} - \frac{1}{v(1 + v^2)}\right) dv \right\}.$$

The limit of the second integral exists and is finite, and that of the first
one is $\log u$. The asserted form (ii) of $\log f(u)$ readily follows, and the
conclusion is reached.

24.5. Lévy representation. This subsection may be read immediately
after 24.1 except for reformulations of results in the intervening
subsections.

So far we used systematically Khintchine representation of i.d. ch.f.'s
$e^{\psi}$ with $\psi = (\alpha, \Psi)$ representing

$$\psi(u) = iu\alpha + \int \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) \frac{1 + x^2}{x^2}\,d\Psi(x)$$

where $\alpha \in R$ and the Khintchine function $\Psi$ is bounded nondecreasing
with $\Psi(-\infty) = 0$, $\Psi(+\infty) < \infty$ or, in terms of the measure which
corresponds biunivoquely to it and is also denoted by $\Psi$, the Khintchine
measure $\Psi$ on $R$ (that is, on the Borel field of $R$), is bounded. $\Psi$ has no
direct probabilistic meaning but presents definite technical advantages:
It permits a simple description of the i.d. family with $\psi = (\alpha, \Psi)$, $\alpha \in R$,
$\Psi$ bounded measure on $R$, as well as a simple description of convergence
of i.d. laws: $\psi_n = (\alpha_n, \Psi_n) \to \psi = (\alpha, \Psi)$ if and only if $\alpha_n \to \alpha$, $\Psi_n \xrightarrow{c} \Psi$.

In fact, "Lévy representation" below was the initial one and is central
to and born from P. Lévy's probabilistic analysis of decomposable
processes (§41).

Let barred integral sign $\bar{\int}$ mean that the origin is excluded from the
interval of integration and, as usual, we omit its endpoints when they
are $-\infty$ and $+\infty$.

P. Lévy representation of i.d. ch.f.'s with $\psi = (\alpha, \beta^2, L)$ is given by

$$\psi(u) = iu\alpha - \frac{\beta^2}{2}u^2 + \bar{\int} \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) dL(x)$$

where $\alpha, \beta \in R$ and the Lévy function $L$ defined on $R - \{0\}$ is
nondecreasing on $(-\infty, 0)$ and on $(0, +\infty)$ with $L(\pm\infty) = 0$ and
$\bar{\int}_{-x}^{x} y^2\,dL(y) < \infty$ for some hence every finite $x > 0$. The corresponding
Lévy measure $L$ on $R - \{0\}$ is bounded outside every neighborhood of
the origin but may be infinite on $R - \{0\}$.

The somewhat involved characterization of Lévy function explains
why Khintchine representation is frequently favored despite its lack
of direct probabilistic meaning.

The following correspondence is immediate:

a. Correspondence lemma. There is a one-to-one correspondence
between Lévy and Khintchine representations.

It is given by $\beta^2 = \Psi(+0) - \Psi(-0)$ and

(1) $dL(x) = \dfrac{1 + x^2}{x^2}\,d\Psi(x)$, $x \neq 0$,

or, more precisely, with $x > 0$,

(1') $L(-x) = \displaystyle\int_{-\infty}^{-x} \frac{1 + y^2}{y^2}\,d\Psi(y)$, $\quad L(x) = -\displaystyle\int_{x}^{+\infty} \frac{1 + y^2}{y^2}\,d\Psi(y)$

and, conversely,

(1'') $\Psi(-x) = \displaystyle\int_{-\infty}^{-x} \frac{y^2}{1 + y^2}\,dL(y)$, $\quad \Psi(x) = \displaystyle\bar{\int}_{-\infty}^{x} \frac{y^2}{1 + y^2}\,dL(y) + \beta^2$.

The continuity sets $C(L)$ and $C(\Psi)$ are the same on $R - \{0\}$.

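A. I.D. CONVERGENCE CRITERION.

$$\psi_n = (\alpha_n, \beta_n^2, L_n) \to \psi = (\alpha, \beta^2, L)$$

if and only if

(i) $L_n \xrightarrow{w} L$ on $R - \{0\}$

(ii) $\bar{\int}_{-x}^{x} y^2\,dL_n(y) + \beta_n^2 \to \beta^2$ as $n \to \infty$ then $0 < x \to 0$

(iii) $\alpha_n \to \alpha$

Proof. Since $\psi_n = (\alpha_n, \Psi_n) \to \psi = (\alpha, \Psi)$ if and only if $\alpha_n \to \alpha$ and
$\Psi_n \xrightarrow{c} \Psi$, it suffices to prove that $\Psi_n \xrightarrow{c} \Psi \iff$ (i) and (ii) hold.

We use a and Helly-Bray lemma and theorem without further comment.
Let $x > 0$.

Let $\Psi_n \xrightarrow{c} \Psi$. Clearly (i) follows. Since for $\pm x \in C(\Psi)$

$$\bar{\int}_{-x}^{x} \frac{y^2}{1 + y^2}\,dL_n(y) + \beta_n^2 = \Psi_n(x) - \Psi_n(-x)$$

(ii) follows as $n \to \infty$ then $0 < x \to 0$, hence without the above restriction
on $\pm x$ since $\Psi(x) - \Psi(-x)$ is monotone in $x$.

Conversely, let (i) and (ii) hold. Clearly $\Psi_n(-x) \to \Psi(-x)$ for
$-x \in C(L)$. For $0 < \epsilon < x \in C(L)$, from

$$\Psi_n(x) = \int_{-\infty}^{-\epsilon} \frac{y^2}{1 + y^2}\,dL_n(y) + \bar{\int}_{-\epsilon}^{\epsilon} \frac{y^2}{1 + y^2}\,dL_n(y) + \beta_n^2 + \int_{\epsilon}^{x} \frac{y^2}{1 + y^2}\,dL_n(y)$$

it follows that, as $n \to \infty$ then $\epsilon \to 0$,

$$\Psi_n(x) \to \Psi(-0) + \beta^2 + \Psi(x) - \Psi(+0) = \Psi(x).$$

The same is true for $x = +\infty$ so that $\Psi_n(+\infty) \to \Psi(+\infty)$. Thus
$\Psi_n \xrightarrow{c} \Psi$ and the proof is terminated.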

* Reformulations. Lévy representation is visible in the main results and
also in the proofs in the preceding subsections:

1. Extrema criterion. Its statement in 23.4C is already in terms of
Lévy function $L$ and of $\beta^2 = \Psi(+0) - \Psi(-0)$ of the i.d. limit law.

2. Extended central convergence criterion. This most important
result of the section 23.4B is to be reformulated as follows.

Let $x > 0$ and set

$$L_n(-x) = \sum_k F_{nk}(-x), \quad L_n(x) = \sum_k (F_{nk}(x) - 1).$$

Then, in terms of $L$ and $\beta^2$ of the limit i.d. law, the criterion conditions
are

$$L_n \xrightarrow{w} L \quad \text{and} \quad \sum_k \sigma^2 X_{nk}^{\epsilon} \to \beta^2 \ \text{as } n \to \infty \text{ then } 0 < \epsilon \to 0.$$

Furthermore, Lévy functions $L_n$ have a direct probabilistic meaning in
terms of the summands $X_{nk}$, $k = 1, \cdots, k_n$:

$L_n(-x) = E(\text{number of the } X_{nk} \text{ in } (-\infty, -x))$

$-L_n(x) = E(\text{number of the } X_{nk} \text{ in } [x, +\infty))$.

3. The proof of the $\mathfrak{N}$-criterion 24.3B is, in fact, in terms of $L$. For,
the functions $J$ and $J^-$ therein are given by $J(x) = -L(e^x)$ and $J^-(x) =
L(-e^x)$.

Lévy functions of stable laws within the proof of 24.4B are:

$\gamma = 2$: $L = 0$ (normal law)
$0 < \gamma < 2$: $dL(x) = \beta/|x|^{1+\gamma}\,dx$ for $x < 0$,
$dL(x) = \beta'/|x|^{1+\gamma}\,dx$ for $x > 0$

CLP for iid summands. In what follows, $f$ and $f_n$ are ch.f.'s.
We intend to solve directly the Central Limit Problem (CLP for short)
for independent identically distributed (iid for short) summands. We
shall use A and generalize results in 24.1 replacing $f_n^n = f$ for every $n$
by $f_n^n \to f$.

b. If $f_n \to 1$ in some neighborhood $[-U, +U]$ of the origin then on
$[-U, +U]$, from some $n = n(U)$ on, $\log f_n$ exist and are bounded and

$$\log f_n = -\sum_{m=1}^{\infty} \frac{1}{m}(1 - f_n)^m = (f_n - 1)(1 + o(1)).$$

For, on $[-U, +U]$, $f_n \to 1$ uniformly so that, from some $n = n(U)$ on,
$|1 - f_n| < 1/2$ hence $\log f_n$ exists and is continuous and thus is bounded,
and

$$\log f_n = \log(1 - (1 - f_n)) = -(1 - f_n) - \tfrac{1}{2}(1 - f_n)^2 - \cdots = (f_n - 1)(1 + o(1)).$$
We generalize 24.1b:

c. If $f_n^n \to f$ then $f$ has no zeros and the same is true when $e^{-iua_n} f_n^n(u) \to
f(u)$ for every $u \in R$.

Proof. It suffices to prove that ch.f.'s $(|f_n|^2)^n \to |f|^2$ implies $|f|^2 > 0$.

Suppose this "symmetrization" already took place so that $f_n^n \to f$ with
$f_n$ and $f \geq 0$.

Since $f$ is continuous with $f(0) = 1$, there is a finite interval $[-U, +U]$
on which $f > 0$ hence $\log f$ exists and is bounded. On this interval, from
some $n$ on, $\log f_n$ exist and are bounded, so that $n \log f_n \to \log f$ hence
$\log f_n \to 0$, that is, $f_n \to 1$, b applies

$$n(f_n - 1)(1 + o(1)) = n \log f_n \to \log f$$

and $n(f_n - 1)$ remain bounded. Since, by 13.4A

$$n(1 - f_n(2u)) \leq 4n(1 - f_n(u)),$$

it follows that on $[-2U, +2U]$, from some $n$ on, $n(1 - f_n) \geq 0$ remain
bounded, so that $f_n \to 1$, b applies and $f = \lim e^{n(f_n - 1)} > 0$.

Upon continuing this doubling of the intervals, any given $u \in R$ belongs
to an interval on which $f > 0$ hence $f > 0$ on $R$, and the proposition is
proved.

B. Iid convergence criterion. Let $\psi$ be continuous.
$f_n^n \to f \iff n(f_n - 1) \to \psi$, and then $f = e^{\psi}$ is i.d.
More generally, if $f_n \to 1$ or $a_n/n \to 0$ then, for every $u \in R$,
$e^{-iua_n} f_n^n(u) \to f(u) \iff -iua_n + n(f_n(u) - 1) \to \psi(u)$,
and then $f = e^{\psi}$ is i.d.

Proof. 1°. Let $n(f_n - 1) \to \psi$, so that $f_n \to 1$, b applies, and $f_n^n \to
e^{\psi} = f$.

Conversely, let $f_n^n \to f$ so that, by c, $f$ has no zeroes and $\log f$ exists
and is continuous. Given any finite interval, it follows that on it, from
some $n$ on, $\log f_n$ exist and are bounded and, by b, $n(f_n - 1) \to \log f = \psi$.

2°. Let $-iua_n + n(f_n(u) - 1) \to \psi(u)$ for every $u \in R$ so that
$-iua_n/n + f_n(u) - 1 \to 0$ hence $a_n/n \to 0 \iff f_n \to 1$. With either of
these equivalent conditions b applies and, for every $u \in R$, from some
$n = n(u)$ on,

$$e^{-iua_n} f_n^n(u) = (e^{-iua_n/n} f_n(u))^n \to f(u).$$

Conversely, let for every $u \in R$,

$$(e^{-iua_n/n} f_n(u))^n = e^{-iua_n} f_n^n(u) \to f(u)$$

so that, by c, $f(u) \neq 0$ hence $e^{-iua_n/n} f_n(u) \to 1$. Thus, once more,
$a_n/n \to 0 \iff f_n \to 1$ and, with either of these equivalent conditions, b applies
and $-iua_n + n(f_n(u) - 1) \to \log f(u) = \psi(u)$.

It remains to show that the limit ch.f. $f$ is i.d. This will follow from
the "structure" proposition below. In fact, this proposition provides a
widening of the definition in 23.1 of i.d. laws since $f_n^n = f$ for every $n$
implies $f_n^n \to f$ but, in general, the converse is not true. It also provides
a direct probabilistic proof of the structure theorem in 23.1:

Let $S_0 = 0$, $S_n = X_1 + \cdots + X_n$, $n = 1, 2, \cdots$, where the summands
are iid with common ch.f. $f$. Let $\lambda \geq 0$. We say that a r.v. is
$(\lambda, f)$-compound Poisson if its d.f. is

$$F_S = e^{-\lambda} \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} F_{S_n}.$$

Clearly $F_S$ is a d.f.: It is nondecreasing with $F_S(-\infty) = 0$, $F_S(+\infty) =
e^{-\lambda} \sum_{n=0}^{\infty} \dfrac{\lambda^n}{n!} = 1$. The corresponding ch.f. is immediate:

$$f_S = e^{-\lambda} \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} f^n = e^{\lambda(f - 1)}.$$

It is an i.d. ch.f., since $e^{(\lambda/m)(f - 1)}$ is the ch.f. of a $(\lambda/m, f)$-compound Poisson
for $m = 1, 2, \cdots$. Also the centered ch.f. $e^{-iua + \lambda(f - 1)}$ is i.d. and the i.d.
assertion in B follows at once. And B yields

Structure corollary. $f$ is i.d. if and only if there are compound
Poisson $f_n$ with $f_n \to f$.
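
The identity $e^{-\lambda} \sum \lambda^n f^n / n! = e^{\lambda(f - 1)}$ and the divisibility remark can be checked at a single value of the ch.f. A minimal sketch, in which the function names and the truncation level `terms` are our own assumptions:

```python
import cmath
import math


def compound_poisson_cf_series(f_u, lam, terms=200):
    """Truncated series e^{-lam} * sum_n lam^n / n! * f(u)^n."""
    total = 0j
    weight = math.exp(-lam)  # Poisson weight for n = 0
    power = 1 + 0j           # f(u)^0
    for n in range(terms):
        total += weight * power
        weight *= lam / (n + 1)
        power *= f_u
    return total


def compound_poisson_cf_closed(f_u, lam):
    """Closed form exp(lam * (f(u) - 1))."""
    return cmath.exp(lam * (f_u - 1))


if __name__ == "__main__":
    # f(u) for a summand equal to +1 or -0.4 with pr. 1/2 each, at u = 0.9
    f_u = 0.5 * cmath.exp(0.9j) + 0.5 * cmath.exp(-0.9j * 0.4)
    print(compound_poisson_cf_series(f_u, 3.0))
    print(compound_poisson_cf_closed(f_u, 3.0))
    # m-th root: a (lam/m, f)-compound Poisson ch.f. raised to the power m
    print(compound_poisson_cf_closed(f_u, 3.0 / 5) ** 5)
```

The three printed values agree up to rounding; the last line is the numerical counterpart of the statement that $e^{(\lambda/m)(f-1)}$ is an $m$-th root of $e^{\lambda(f-1)}$.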

C. IID CENTRAL CONVERGENCE CRITERION. Let $X_{nk}$, $k = 1, \cdots, n$, be
iid summands with common d.f. $F_n$ and ch.f. $f_n \to 1$. Let $x > 0$.

$\mathcal{L}(\sum_k X_{nk} - a_n) \to \mathcal{L}(X)$ necessarily i.d. with $\psi = (\alpha, \beta^2, L)$
if and only if

$(C_L)$: $L_n \xrightarrow{w} L$ with $L_n$ defined by
$L_n(-x) = nF_n(-x)$, $L_n(x) = n(F_n(x) - 1)$, $x > 0$.

$(C_\beta)$: $n \displaystyle\int_{-x}^{x} y^2\,dF_n(y) \to \beta^2$ as $n \to \infty$ then $x \to 0$.

$(C_\alpha)$: $a_n = \alpha_n - \alpha + o(1)$ with $\alpha_n = n \displaystyle\int \frac{x}{1 + x^2}\,dF_n(x)$.

Note that $(C_\alpha)$ characterizes all admissible $a_n$.

Proof. According to B, the required convergence is equivalent to
$\psi_n(u) = -iua_n + n \int (e^{iux} - 1)\,dF_n(x) \to \psi(u)$, $u \in R$ where, setting
$\alpha_n = n \displaystyle\int \frac{x}{1 + x^2}\,dF_n(x)$,

$$\psi_n(u) = iu(\alpha_n - a_n) + n \int \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) dF_n(x) = (\alpha_n - a_n, \beta_n^2, L_n),$$

with $L_n$ defined by

$$L_n(-x) = nF_n(-x), \quad L_n(x) = n(F_n(x) - 1), \quad x > 0,$$

corresponding $\Psi_n$ defined by

$$\Psi_n(z) = n \int_{-\infty}^{z} \frac{y^2}{1 + y^2}\,dF_n(y), \quad z \in R,$$

and $\beta_n^2$ determined by

$$n \int_{-x}^{x} y^2\,dF_n(y) = \int_{-x}^{x} (1 + y^2)\,d\Psi_n(y) = \bar{\int}_{-x}^{x} y^2\,dL_n(y) + \beta_n^2.$$

The asserted criterion follows at once from A.

COMPLEMENTS AND DETAILS

1. Prove Lindeberg's theorem without using Liapounov's bounded case
theorem. Then deduce Liapounov's theorem.

For Lindeberg’s theorem use the expansion 

/ k ( z ) = 1 -感 “ 2 + 。 S (奶 >1 + dF ^) 

To deduce Liapounov’s theorem observe that 



L.f dFk 


E\ X k 

Jn 2+i 


2. Prove directly the sufficiency of Kolmogorov’s conditions for degenerate 
convergence. Then deduce the condition in (1 + 5). 

3. Deduce the Kolmogorov and Lindeberg-Feller theorems from the
degenerate and normal convergence criteria, where existence of moments is not
assumed.

4. Deduce the bounded variances limit theorem from the Central Limit 
theorem. 


5. Let ^nk be sums of independent uan summands centered at expecta- 

k 


tions with X <r 2 Xnk = 1 whatever be n. Then 

k 


£(Z X^) 31(0, 1) «=> ^ 1. 

k k 


(Observe that the last convergence is equivalent to X) dF n k — > 0 what¬ 

ever be e > 0.) 

6. Let $\zeta(t + iu)$, $t > 1$, be the Riemann function defined by

$$\zeta(t + iu) = \sum_n n^{-t-iu} = \prod_p (1 - p^{-t-iu})^{-1}$$

where $p$ varies over all primes. $f_t(u) = \zeta(t + iu)/\zeta(t)$ is an i.d. ch.f.

$$\left(\log f_t(u) = \sum_p \sum_n p^{-nt}(e^{-iun \log p} - 1)/n.\right)$$

7. An i.d. law may be composed of two non i.d. laws. In fact, there exists a
non i.d. ch.f. $f$ such that $|f|^2$ is i.d.: form the ch.f. $f$ of $X$ with $P[X = -1] =
p(1 - p)/(1 + p)$, $P[X = k] = (1 - p)(1 + p^2)p^k/(1 + p)$, $k = 0, 1, \cdots$, $0 <
p < 1$.

(Put $f$ in the form $(\alpha, \Psi)$; observe that $\Psi$ so found does not satisfy the
necessary requirements. Put $|f|^2$ in the form $(\alpha, \Psi)$.)

8. An i.d. law may be composed of an i.d. law and an indecomposable one:
let $X = 0$ or $1$ with pr.'s 2/3 and 1/3, respectively; the ch.f. $f$ is indecomposable

$$\log f(u) = \log \frac{2 + e^{iu}}{3} = \sum_n a_n(e^{inu} - 1), \quad \sum_n |a_n| < \infty.$$

Set

$$\log f^+(u) = {\sum}' a_n(e^{inu} - 1), \quad \log f^-(u) = -{\sum}'' a_n(e^{inu} - 1)$$

where $\sum'$ ($\sum''$) denotes summation over positive (negative) $a_n$. Then $f^+$
and $f^-$ are i.d. and $f^+ = f f^-$.

Also an i.d. law may be the product of an i.d. law and two indecomposable
ones: proceed as above but with $f$ defined by

$$f(u) = \frac{5 + 4 \cos u}{9} = \left|\frac{2 + e^{iu}}{3}\right|^2.$$


9. P. Lévy centering function. The family of i.d. laws coincides with laws
defined by

$$\log f(u) = i\alpha u - \frac{\beta^2}{2}u^2 + \bar{\int} \left(e^{iux} - 1 - \frac{iux}{1 + x^2}\right) dL(x)$$

where $L$ is defined on $R$ except at the origin, is nondecreasing on $(-\infty, -0)$
and on $(+0, +\infty)$, with $L(\mp\infty) = 0$ and $\bar{\int}_{-\tau}^{+\tau} x^2\,dL(x) < \infty$ for some $\tau > 0$; the
barred integral sign means that the origin is excluded.

Also

$$\log f(u) = i\alpha(\tau)u - \frac{\beta^2}{2}u^2 + \bar{\int}_{-\tau}^{+\tau} (e^{iux} - 1 - iux)\,dL(x) + \left(\int_{-\infty}^{-\tau} + \int_{+\tau}^{+\infty}\right) (e^{iux} - 1)\,dL(x).$$

This splitting of the domain of integration replaces the P. Lévy centering
function $g(x) = x/(1 + x^2)$ by much simpler ones ($g(x) = x$ and $g(x) = 0$)
within the partial domains of integration.

Why was the centering function needed? Then, what are the conditions to
impose upon it? Show that Feller's centering function $g(x) = \sin x$ is acceptable.
Is the following one acceptable: $g(x) = x$ for $|x| < c$ for some finite positive
constant $c$, $g(x) = c$ for $x \geq c$ and $= -c$ for $x \leq -c$?

10. Let r.v.'s $X_{n,k}$ with d.f.'s $F_{n,k}$, $k = 1, \cdots, k_n \to \infty$, $n = 1, 2, \cdots$, be
independent in $k$ and uniformly asymptotically distributed in $k$, that is, there
exist d.f.'s $F_n$ such that $F_{n,k} - F_n \to 0$ uniformly in $k$. The nondecreasingly
ranked numbers $X_{n,k}(\omega)$ into $X^*_{n,1}(\omega) \leq \cdots \leq X^*_{n,k_n}(\omega)$ determine "ranked"
$X^*_{n,r}$ of "rank" $r$; the ${}^*X_{n,s} = X^*_{n,k_n+1-s}$ are of "end rank" $s$. Set

$$L_n = \sum_k F_{n,k}, \quad M_n = \sum_k (F_{n,k} - 1),$$

$$g_{n,r_n} = \left(r_n - \sum_k F_{n,k}\right) \Big/ \sqrt{\sum_k F_{n,k}(1 - F_{n,k})},$$

$$I_n = \sum_k I_{n,k}, \quad \bar{I}_n = (I_n - EI_n)/\sigma I_n, \quad I_{n,k}(x) = I_{[X_{n,k} < x]}.$$

Use throughout the fundamental relation

$$[X^*_{n,r} < x] = [I_n(x) \geq r].$$


a) The $X^*_{n,r}$ are r.v.'s.

b) For fixed ranks $r$, the class of limit laws of ranked r.v.'s $X^*_{n,r}$ is that of laws
$\mathcal{L}(X^*_r)$ with d.f.'s $F_r^L = \dfrac{1}{(r - 1)!} \displaystyle\int_0^{L} t^{r-1} e^{-t}\,dt$, where the functions $L$ on $R$ are
nondecreasing, nonnegative, and not necessarily finite.

These limit laws are laws of r.v.'s if and only if $L(-\infty) = 0$, $L(+\infty) = +\infty$.
And

$$F^*_{n,r} \xrightarrow{w} F_r^L \iff L_n \xrightarrow{w} L.$$

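c) For fixed endranks $s$, the class of limit laws of ranked r.v.'s ${}^*X_{n,s}$ is that of
laws $\mathcal{L}({}^*X_s)$ with d.f.'s ${}^*F_s^M = \displaystyle\int_{-M}^{+\infty} \frac{t^{s-1}}{(s - 1)!} e^{-t}\,dt$ where the functions $M$ on $R$
are nondecreasing, nonpositive, and not necessarily finite.

These limit laws are laws of r.v.'s if and only if $M(-\infty) = -\infty$, $M(+\infty) = 0$.
And

$${}^*F_{n,s} \xrightarrow{w} {}^*F_s^M \iff M_n \xrightarrow{w} M.$$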

d) For variable ranks $r_n \to \infty$ with $k_n + 1 - r_n \to \infty$, the class of limit
laws of ranked r.v.'s $X^*_{n,r_n}$ is that of laws with d.f.'s $F^g = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_g^{\infty} e^{-t^2/2}\,dt$,
where the functions $g$ on $R$ are nonincreasing, and not necessarily finite.

These limit laws are those of r.v.'s if and only if $g(-\infty) = +\infty$, $g(+\infty) = -\infty$.
And

$$F^*_{n,r_n} \xrightarrow{w} F^g \iff g_{n,r_n} \xrightarrow{w} g.$$

e) What if the $X_{nk}$ are uniformly asymptotically negligible? What if, moreover,
$\mathcal{L}(\sum_k X_{nk}) \to \mathcal{L}(X)$?

f) What about joint limit laws of ranked r.v.'s?

11. Let $\mathcal{L}(\sum_k X_{nk} - a_n) \to (\alpha, \beta^2, L)$ where $\sum_k X_{nk}$ are sums of uan
independent r.v.'s.

(a) The sequence $\mathcal{L}(\max_k |X_{nk}|)$ converges. Find the limit law $\mathcal{L}(Y)$. Why
can necessary and sufficient conditions for normality of the limit law of the
sequence $\mathcal{L}(\sum_k X_{nk} - a_n)$ be expressed in terms of $\mathcal{L}(Y)$? Are there other i.d. laws
for which this is possible? (For $n$ sufficiently large and $x > 0$

$$\log P[\max_k |X_{nk}| < x] = -(1 + o(1)) \sum_k P[|X_{nk}| \geq x].)$$

(b) Let ^nk = f t x ^Fnky r > 0 finite, F n k(x) = Fnk(x + ank) and let F\k 

J \x\ <T 

be the d.f. of X f n k = | X n k — a n k \ r (or a fixed r > 1. 

If £(Z) ^nk — a n ) —► (a, j8 2 , L) y then there exist constants a 9 n such that 

£(E A 」 〆》) — 0, Z/) with L\x) = 0 or L(x^ r ) - according 

k 

as x < 0 or x > 0. (If ^ ^ 0 is even, then, for every r > 0, 

J = J[ 2 | <c l/r^l^| r )^> J c ^ dF，nk = Xl>* 1/r ^^ 冲 ) 爪 . 
Take g = 1 and g(x) = xK Observe that 

0 On * - (/»、 

(c) £(23 Xnk — a n ) 91(0, j 8 2 ) if, and only if, 22 j 8 2 . What about 

* k 

limit Poisson laws? 

In what follows and unless otherwise stated, degenerate laws are excluded;
$f$, with or without affixes, is a ch.f.; and, without restricting the generality, the
type of $f$ is the family of all ch.f.'s defined by $f(cu)$ for some $c > 0$.

12. $f$ is decomposable by every $f^n$, $n = 2, 3, \cdots$, if, and only if, $f$ is degenerate.

13. j is decomposable and every component belongs to Its type with /(«)= 

Yi if and only if / is normal. 

14. If for an $r > 0$ and $\neq 1$, $f^r$ belongs to the type of $f$, then $f$ is i.d. If there
are two such values $r'$ and $r''$ of $r$ and $\log r'/\log r''$ is irrational, then $f$ is stable.

15. 1(f n —► JJ\ f and/n for every then j f is a component 

of /. 

$f$ is $c$-decomposable if $f(u) = f(cu) f_c(u)$ for some fixed $c$ necessarily
between 0 and 1. $L_c$ is the family of all $c$-decomposable laws, $L_0$ is the family of
all laws, and $L_1$ is that of self-decomposable ones.

(a) $L_0 \supset L_c \supset L_1$, and if $\log c/\log c'$ is rational, then $L_c = L_{c'}$. Every $L_c$
is closed under compositions and passages to the limit.

(b) $f \in L_c$ if, and only if, it is limit of a sequence of ch.f.'s of normed sums
$S_n/b_n$ of independent r.v.'s with $b_n/b_{n+1} \to c$.

(c) $f \in L_c$ if, and only if, it is ch.f. of $X(c) = \sum_k \xi_k c^k$ where the law of the
series converges and the $\xi_k$ are independent and identically distributed. Then
the series converges a.s., and $f_{\xi} = f_c$. If $\xi$ is bounded, then $f$ is not i.d.

(d) g(x) is said to be 7 -con vex (7 > 0 fixed) if every polygonal line inscribed 
in its graph with vertices projecting at distance y on the x-axis is convex. 

If is i.d., so is X(c). /i.d. with Livy’s function L belongs to L c and f c is 
i.d. only if (一 arc 7 -con vex for 7 = | 1(^ c | where Mj are defined as in 9. 
I 3 the converse true? 

(e) If $E\xi_k = 0$, $\sigma^2 \xi_k = 1$, then, for $c, c' \in (-1, +1)$, the covariance
$EX(c)X(c') = 1/(1 - cc')$, and the random function $X(c)$ on $(-1, +1)$ exists
in q.m. and is continuous and indefinitely differentiable in q.m.







Chapter VII 


INDEPENDENT IDENTICALLY 
DISTRIBUTED SUMMANDS 


This chapter is devoted to study in some depth of consecutive sums
$S_1, S_2, \cdots$ of sequences of independent identically distributed
summands $X_1, X_2, \cdots$ with common law $\mathcal{L}(X)$; we shorten "independent
identically distributed" to iid. As usual, methods are emphasized.
Methods and results took their definitive form in the third quarter of
this century.

In the preceding chapters some results about iid summands were
obtained: Kolmogorov law of large numbers (17.3B) and its generalization
17.4, 4°, convergence of laws of normed sums to normal when the
summands have finite second moments (21.1A) and the far-reaching
characterization of all limit laws of normed sums (24.4), by
particularizing the solution of the general central limit problem.

In this chapter, using directly 24.5, by means of Karamata theory, we
obtain in §25 the above limit "stable" laws and their "domains of
attraction": those families of laws for which the laws of normed sums
$S_n/b_n - a_n$ converge to any given stable one.

In §26, we study "random walks": sequences of sums $S_1, S_2, \cdots$
themselves (not normed), their global and asymptotic behaviour with
their dichotomy into "recurrent" and "transient" ones, and their
fascinating "finite fluctuations."

§25. REGULAR VARIATION AND DOMAINS OF ATTRACTION 

The domain of attraction of the normal law was found by P. Lévy, by
Feller, and by Khintchine. The domains of attraction of all other stable
laws were discovered by Doeblin and by Gnedenko. Much later, Feller
observed that these results were in terms of Karamata regular variation
theory and showed its usefulness for various limit probability problems.
We follow his presentation of Karamata theory, and then apply it to the
problem of stable laws and their domains of attraction. It seems
advisable that at the first reading only A and its Corollary be covered in
25.1 and c in 25.2 be assumed.

25.1 Regular variation. Let $U$, $V$ be positive monotone functions on
$[0, \infty)$ to $[0, \infty)$ and let $t$, $x$, $y$ be positive.

We say that $U$ varies regularly (at $+\infty$) with exponent $a \in R$ if
$U(x) = x^a V(x)$ where $V$ varies slowly (at $+\infty$), that is, $V(tx)/V(t) \to 1$
as $t \to \infty$ for every $x$. Thus slow variation is regular variation with
exponent 0. Since our only concern is with behaviour at $+\infty$, we may
take $x, y > c$ with $c > 0$ arbitrary but fixed, or substitute $(c, \infty)$
for $[0, \infty)$, or assume that $U$, $V$ vanish on $[0, c]$; this will be done without
further comment.
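
To get a feeling for how slowly the defining ratio may converge, the following sketch evaluates $V(tx)/V(t)$ for the slowly varying function $V(y) = \log(1 + y)$, and the corresponding ratio for $U(x) = x^a V(x)$. The choice of $V$ and the helper names are our own assumptions, made only for illustration.

```python
import math


def log_like(y):
    """A slowly varying example: V(y) = log(1 + y)."""
    return math.log(1.0 + y)


def slow_ratio(V, t, x):
    """V(tx) / V(t); tends to 1 as t grows when V varies slowly."""
    return V(t * x) / V(t)


def regular_ratio(a, V, t, x):
    """U(tx) / U(t) for U(x) = x^a V(x), written so that t^a cancels."""
    return x ** a * slow_ratio(V, t, x)


if __name__ == "__main__":
    for t in (1e3, 1e30, 1e300):
        print(t, slow_ratio(log_like, t, 3.0), regular_ratio(1.5, log_like, t, 3.0))
```

The first column of ratios decreases to 1 only at the rate $1/\log t$, while the second approaches $3^{1.5}$; this is why statements about slowly varying factors are always asymptotic.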

A. Regular variation criterion. Let $D$ be a set dense in $[0, \infty)$.

$U$ varies regularly if and only if, for every $x \in D$,

$$U(tx)/U(t) \to h(x) < \infty \ \text{as } t \to \infty,$$

and then $h(x) = x^a$ for some $a \in R$.

Proof. The "only if" assertion is trivially true. As for the "if"
assertion, letting $t \to \infty$ in

$$\frac{U(txy)}{U(t)} = \frac{U(txy)}{U(ty)} \cdot \frac{U(ty)}{U(t)},$$

it follows that

$$h(xy) = h(x)h(y) \ \text{for } x, y \in D.$$

Since $U$ is monotone, this functional equation extends to $[0, \infty)$ by taking
limits from the right. But then it has a unique finite solution of the form
$h(x) = x^a$ for some $a \in R$, and the proof is terminated.

Corollary. If for every $x \in D$ dense in $[0, \infty)$,

$$c_n U(b_n x) \to h(x) \ \text{finite positive}$$

and

$$b_n \to \infty, \quad c_{n+1}/c_n \to 1,$$

then $U$ varies regularly and $h(x) = cx^a$ for some finite $a$ and $c > 0$.

Proof. If $n$ is the smallest integer such that $b_n \leq t < b_{n+1}$ then

$$\frac{U(b_n x)}{U(b_{n+1})} \leq \frac{U(tx)}{U(t)} \leq \frac{U(b_{n+1} x)}{U(b_n)}$$

when $U$ is nondecreasing, while these inequalities are reversed when $U$ is
nonincreasing. By a change of scale we may assume that $1 \in D$. Then,
since $c_{n+1}/c_n \to 1$ and $c_n U(b_n) \to h(1) = c > 0$, for every $x \in D$ the
extreme terms converge to $h(x)/c$ hence $U(tx)/U(t) \to h(x)/c$, the above
criterion applies and $h(x)/c = x^a$ for some $a \in R$.

*Let $H$ be a positive monotone function on $[0, \infty)$ and set

$$U_a(x) = \int_0^x y^a H(y)\,dy, \quad V_a(x) = \int_x^{\infty} y^a H(y)\,dy$$

where $x > 0$ and $a$ are finite.

Upon replacing if necessary 0 by $c > 0$, or assuming that $H$ vanishes on
$[0, c]$, $U_a(x)$ will be finite while $V_a(x)$ may be infinite. Since

$$U_a(x) \uparrow U_a(\infty) \ \text{and} \ V_a(x) \downarrow V_a(\infty) \ \text{as } x \uparrow \infty$$

while

$$U_a(\infty) = U_a(x) + V_a(x) \ \text{hence} \ U_a(\infty) = U_a(\infty) + V_a(\infty),$$

it follows that

$U_a(\infty) < \infty \iff V_a(\infty) = 0 \implies V_a(x) < \infty$ from some $x$ on

$U_a(\infty) = \infty \iff V_a(x) = \infty$ for every $x \iff V_a(\infty) = \infty$.

a. Let $H$ vary slowly. Then $U_a(\infty)$ and $V_a(\infty)$ are finite for $a < -1$
and infinite for $a > -1$. Furthermore

(i) If $a \geq -1$ then $U_a$ varies regularly with exponent $a + 1$.

(ii) If $a < -1$ then $V_a$ varies regularly with exponent $a + 1$, and this
still holds for $a = -1$ provided $V_{-1}$ is finite.

Proof. Given $x > 0$ and $\epsilon > 0$, slow variation of $H$ implies existence
of $\delta > 0$ such that, for $y > \delta$,

(1) $(1 - \epsilon)H(y) \leq H(xy) \leq (1 + \epsilon)H(y)$.

1°. Let $V_a(\infty) = 0$ hence $V_a(x) < \infty$ from some $x$ on, and $U_a(\infty) < \infty$.
Since

$$V_a(tx) = x^{a+1} \int_t^{\infty} y^a H(xy)\,dy,$$

it follows that, for $t > \delta$,

$$(1 - \epsilon)x^{a+1} V_a(t) \leq V_a(tx) \leq (1 + \epsilon)x^{a+1} V_a(t)$$

hence, letting $t \to \infty$ then $\epsilon \to 0$, $V_a(tx)/V_a(t) \to x^{a+1}$. Thus, $V_a$ varies
regularly with exponent $a + 1 \leq 0$ since $V_a$ is nonincreasing, and
$U_a(\infty) < \infty$ with $V_a(\infty) = 0$ only if $a \leq -1$.

2°. Let $U_a(\infty) = \infty$ hence $V_a(\infty) = \infty$. Since, for $t > \delta$,

$$U_a(tx) = U_a(\delta x) + x^{a+1} \int_{\delta}^{t} y^a H(xy)\,dy$$

hence, by (1),

$$(1 - \epsilon)x^{a+1}(U_a(t) - U_a(\delta)) \leq U_a(tx) - U_a(\delta x) \leq (1 + \epsilon)x^{a+1}(U_a(t) - U_a(\delta));$$

upon dividing by $U_a(t)$ and letting $t \to \infty$ then $\epsilon \to 0$, it follows that
$U_a(tx)/U_a(t) \to x^{a+1}$. Thus, $U_a$ varies regularly with exponent
$a + 1 \geq 0$ since $U_a$ is nondecreasing, and $U_a(\infty) = \infty$ hence $V_a(\infty) =
\infty$ only if $a \geq -1$. The assertions follow from 1° and 2°.

*B. Main Karamata theorem. Let $H$ be positive monotone on $[0, \infty)$
and set

$$U_a(x) = \int_0^x y^a H(y)\,dy, \quad V_a(x) = \int_x^{\infty} y^a H(y)\,dy.$$

(i) If $H$ varies regularly with exponent $b \leq -a - 1$ and $V_a(x) < \infty$
then, as $t \to \infty$,

$$t^{a+1} H(t)/V_a(t) \to c = -(a + b + 1) \geq 0.$$

Conversely, if this limit exists and is positive then $V_a$ and $H$ vary regularly
with exponents $-c = a + b + 1$ and $b$, respectively, while if this limit is 0
then $V_a$ varies slowly.

(ii) If $H$ varies regularly with exponent $b \geq -a - 1$ then, as $t \to \infty$,

$$t^{a+1} H(t)/U_a(t) \to c = a + b + 1 \geq 0.$$

Conversely, if this limit exists and is positive then $U_a$ and $H$ vary regularly
with exponents $c = a + b + 1$ and $b$, respectively, while if this limit is 0
then $U_a$ varies slowly.

Note that when $c = 0$ the converse assertions for $c > 0$ continue to
hold for $V_a$ and for $U_a$, but nothing can be asserted regarding $H$.

Proof. The argument for (i) and (ii) is the same, and we shall prove (i).
Set

(1) $h(y)/y = y^a H(y)/V_a(y)$.

Since

$$y^a H(y) = -\frac{dV_a(y)}{dy},$$

upon integrating (1) over $[t, tx)$ with $x > 1$, it follows that

$$\text{(2)} \quad -\log \frac{V_a(tx)}{V_a(t)} = \int_t^{tx} \frac{h(y)}{y}\,dy = h(t) \int_1^x \frac{h(tz)}{h(t)}\,\frac{1}{z}\,dz.$$

Let $H$ vary regularly with exponent $b$ so that, by a, $V_a$ varies with
exponent $a + b + 1 = -c$. Thus, both sides of (1) vary regularly with
exponent $-1$ and $h$ varies slowly. Therefore, as $t \to \infty$, the integrand in
the last integral in (2) tends to $1/z$ while the first term in (2) tends to
$c \log x$ and Fatou-Lebesgue theorem implies that $\limsup h(t) \leq c$.
Thus, $h$ is bounded so that there is a sequence $t_n \to \infty$ with $h(t_n) \to
c' < \infty$. Since $h$ varies slowly, $h(t_n y) \to c'$ for every $y > 0$ and, by
the dominated convergence theorem, $c \log x = c' \log x$ hence $c = c'$ for
every such sequence $(t_n)$. Therefore, $h(t) \to c$ as $t \to \infty$ and the direct
assertion is proved.

Conversely, if the limit $c \geq 0$ exists so that $h(t) \to c$ as $t \to \infty$ then,
by (2), $V_a$ varies regularly with exponent $-c$. Moreover if $c > 0$ then this
property of $V_a$ together with (1) implies regular variation of $H$ with
exponent $-c - a - 1 = b$. This completes the proof of (i) and (ii) is
proved similarly.
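
Assertion (ii) can be illustrated on a case where $U_a$ is available in closed form. The sketch below assumes $H(y) = \log y$ on $[1, \infty)$ (so $b = 0$, with 0 replaced by 1 as lower limit, which the conventions above allow) and $a > -1$; then $U_a(t) = \int_1^t y^a \log y\,dy = t^{k}(\log t/k - 1/k^2) + 1/k^2$ with $k = a + 1$, and the theorem predicts $t^{a+1}H(t)/U_a(t) \to a + 1$. The function names are ours.

```python
import math


def U_a_log(t, a):
    """Closed form of the integral of y^a * log(y) over [1, t], for a > -1."""
    k = a + 1.0
    return t ** k * (math.log(t) / k - 1.0 / k ** 2) + 1.0 / k ** 2


def karamata_ratio(t, a):
    """t^(a+1) H(t) / U_a(t) for H = log; should approach c = a + 1."""
    return t ** (a + 1.0) * math.log(t) / U_a_log(t, a)


if __name__ == "__main__":
    for t in (1e10, 1e50, 1e100):
        print(t, karamata_ratio(t, 1.0), karamata_ratio(t, 0.0))
```

The ratios approach 2 and 1 respectively, with an error of order $1/\log t$, in line with $c = a + b + 1$ for $b = 0$.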

*C. Slow variation criterion. $H$ varies slowly if and only if

$$H(x) = h(x) \exp\left\{\int_1^x \frac{g(y)}{y}\,dy\right\}$$

where $g(x) \to 0$ and $h(x) \to c < \infty$ as $x \to \infty$.

Proof. The "if" assertion is easily verified. As for the "only if"
assertion, let $H$ vary slowly. Then, by B(ii) with $a = b = 0$,

$$H(t)/U_0(t) = (1 + g(t))/t \ \text{with } g(t) \to 0 \text{ as } t \to \infty.$$

Since $H(t) = \dfrac{dU_0(t)}{dt}$, upon integrating over $[1, x)$ with $x > 1$, it follows
that

$$U_0(x) = U_0(1)\,x \exp\left\{\int_1^x \frac{g(t)}{t}\,dt\right\}.$$

But, by B(ii),

$$H(x) = h(x)\,U_0(x)/x,$$

and the "only if" assertion obtains.

Corollary. If $H$ varies slowly then, as $x \to \infty$, $H(x + y)/H(x) \to 1$
and, given $\delta > 0$, $x^{-\delta} H(x) \to 0$, $x^{\delta} H(x) \to \infty$, and $x^{-\delta} < H(x) < x^{\delta}$
from some $x$ on.

*Let $G$ be a d.f. vanishing on $(-\infty, 0)$. Let $x > 0$ be finite and set

$$\mu_{\alpha}(x) = \int_0^x y^{\alpha}\,dG(y), \quad \nu_{\beta}(x) = \int_x^{\infty} y^{\beta}\,dG(y).$$

Since we are concerned only with asymptotic behaviour of these
integrals, whenever convenient we do take $G = 0$ in some neighborhood of
the origin. We assume that

$$\mu_{\alpha}(\infty) = \lim_{x \to \infty} \mu_{\alpha}(x) = \infty, \quad \nu_{\beta}(\infty) = \lim_{x \to \infty} \nu_{\beta}(x) = 0$$

so that $\alpha > 0$ and $-\infty < \beta < \alpha$.

The elementary integration by parts which follows will reduce the
question of regular variation of $\mu_{\alpha}$ and of $\nu_{\beta}$ to the main Karamata
theorem.

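b. Integration by parts lemma. Let $x$ be a continuity point of $G$
hence of $\mu_\alpha$ and of $\nu_\beta$. Then

\[ \text{(i)}\quad \mu_\alpha(x) = -x^{\alpha-\beta}\nu_\beta(x) + (\alpha - \beta)\int_0^x y^{\alpha-\beta-1}\nu_\beta(y)\,dy \]

\[ \text{(ii)}\quad \nu_\beta(x) = -x^{\beta-\alpha}\mu_\alpha(x) + (\alpha - \beta)\int_x^\infty y^{\beta-\alpha-1}\mu_\alpha(y)\,dy. \]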


Proof. Relation (i) results at once from integration by parts of
Stieltjes integrals. Relation (ii) requires also a passage to the limit:
Integration by parts on $[x, t)$ with $t > x$ continuity point of $G$ yields

\[ \text{(1)}\quad \nu_\beta(x) - \nu_\beta(t) = -x^{\beta-\alpha}\mu_\alpha(x) + t^{\beta-\alpha}\mu_\alpha(t) + (\alpha - \beta)\int_x^t y^{\beta-\alpha-1}\mu_\alpha(y)\,dy. \]

Thus,

\[ (\alpha - \beta)\int_x^t y^{\beta-\alpha-1}\mu_\alpha(y)\,dy \le \nu_\beta(x) + x^{\beta-\alpha}\mu_\alpha(x) \]

so that, letting $t \to \infty$, the limit of the integral on the left is finite. Since
$\mu_\alpha$ is nondecreasing, as $t \to \infty$,

\[ t^{\beta-\alpha}\mu_\alpha(t) \le (\alpha - \beta)\int_t^\infty y^{\beta-\alpha-1}\mu_\alpha(y)\,dy \to 0 \]

hence $t^{\beta-\alpha}\mu_\alpha(t) \to 0$ and, letting $t \to \infty$ in (1), (ii) obtains.

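*D. Variation of truncated moments. Let $\mu_\alpha(\infty) = \infty$ and
$\nu_\beta(\infty) = 0$ so that $\alpha > 0$ and $-\infty < \beta < \alpha$.

(i) If $\mu_\alpha$ or $\nu_\beta$ varies regularly, then, as $x \to \infty$,

\[ \frac{x^{\alpha-\beta}\nu_\beta(x)}{\mu_\alpha(x)} \to c = \frac{\alpha - \gamma}{\gamma - \beta} \ge 0, \qquad \beta \le \gamma \le \alpha. \]

(ii) Conversely, if this limit exists then, for $\beta < \gamma < \alpha$, $\mu_\alpha$ and $\nu_\beta$ vary
regularly with exponents $a = \alpha - \gamma > 0$ and $b = \beta - \gamma < 0$, respectively,
while $a = 0$ when $\gamma = \alpha$ and $b = 0$ when $\gamma = \beta$.

Note that in the boundary cases, while $\mu_\alpha$ varies slowly when $\gamma = \alpha$ and
$\nu_\beta$ varies slowly when $\gamma = \beta$, nothing can be asserted regarding $\nu_\beta$ or
$\mu_\alpha$, respectively.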


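Proof. 1°. Let $\mu_\alpha$ vary regularly with exponent $u$. Finiteness of the
integral in b(ii) yields $u \le \alpha - \beta$. Since $\mu_\alpha$ is nondecreasing, $u \ge 0$.
Thus, setting $u = \alpha - \gamma$, we have $\beta \le \gamma \le \alpha$ with $\gamma \ge 0$. Now, b(ii)
yields

\[ \text{(1)}\quad \frac{x^{\alpha-\beta}\nu_\beta(x)}{\mu_\alpha(x)} = -1 + \frac{\alpha - \beta}{x^{\beta-\alpha}\mu_\alpha(x)}\int_x^\infty y^{\beta-\alpha-1}\mu_\alpha(y)\,dy \]

so that, using B(i) with $H = \mu_\alpha$ and $a = \beta - \alpha - 1$, as $x \to \infty$,

\[ x^{\alpha-\beta}\nu_\beta(x)/\mu_\alpha(x) \to -1 + (\alpha - \beta)/(\gamma - \beta) = (\alpha - \gamma)/(\gamma - \beta) = c \]

with $c = \infty$ when $\gamma = \beta$, and this is the asserted limit. Let $\nu_\beta$ vary
regularly with exponent $v$ so that $v \ge \beta - \alpha$. Since $\nu_\beta$ is nonincreasing, $v \le 0$.
Thus, setting $v = \beta - \gamma$, we have $\beta \le \gamma \le \alpha$ with $\gamma \ge 0$. Proceeding
as above but with b(i) in lieu of b(ii) and using B(i) but with $H = \nu_\beta$
and $a = \alpha - \beta - 1$, once more the asserted limit obtains and (i) is
proved.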


2°. Conversely, let the limit $c = (\alpha - \gamma)/(\gamma - \beta)$ exist. If $0 < c < \infty$
then (1) yields, as $x \to \infty$,

\[ \text{(2)}\quad x^{\beta-\alpha}\mu_\alpha(x) \Big/ \int_x^\infty y^{\beta-\alpha-1}\mu_\alpha(y)\,dy \to (\alpha - \beta)/(c + 1) = \gamma - \beta. \]

Using B(i), it follows that $\mu_\alpha$ varies regularly with exponent $\alpha - \gamma > 0$
while, by (1), $\nu_\beta$ varies regularly with exponent $\beta - \gamma < 0$. If $c = 0$
the same argument shows that $\mu_\alpha$ varies slowly but yields nothing about
$\nu_\beta$. Similarly, if $c = \infty$ then $\nu_\beta$ varies slowly but nothing can be asserted
about $\mu_\alpha$.

The proof is terminated. 
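The limit in D(i) can be checked in closed form on a Pareto law. The sketch below is an illustration under stated assumptions: $G(y) = 1 - y^{-\gamma}$ for $y \ge 1$ (so that $dG(y) = \gamma y^{-\gamma-1}dy$), with $\beta < \gamma < \alpha$; the function names are hypothetical. For this $G$, $\mu_\alpha(x) = \gamma(x^{\alpha-\gamma} - 1)/(\alpha - \gamma)$ and $\nu_\beta(x) = \gamma x^{\beta-\gamma}/(\gamma - \beta)$.

```python
def mu(alpha, gamma, x):
    """Truncated moment: integral of y^alpha dG(y) over [1, x] for the Pareto d.f. G."""
    return gamma * (x ** (alpha - gamma) - 1.0) / (alpha - gamma)

def nu(beta, gamma, x):
    """Tail moment: integral of y^beta dG(y) over [x, infinity), for x >= 1."""
    return gamma * x ** (beta - gamma) / (gamma - beta)

def ratio(alpha, beta, gamma, x):
    """x^(alpha-beta) nu_beta(x) / mu_alpha(x), which D(i) says tends to c."""
    return x ** (alpha - beta) * nu(beta, gamma, x) / mu(alpha, gamma, x)

if __name__ == "__main__":
    alpha, beta, gamma = 2.0, 0.0, 1.5
    c = (alpha - gamma) / (gamma - beta)
    for x in (1e2, 1e4, 1e8):
        print(f"x={x:.0e}  ratio={ratio(alpha, beta, gamma, x):.6f}  c={c:.6f}")
```

With $\alpha = 2$ and $\beta = 0$ this is the ratio $x^2 q(x)/\mu_2(x)$ used in 25.2, up to the identification of $G$ with the d.f. of $|X|$.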


25.2 Domains of attraction. Throughout this subsection, $X_1, X_2, \dots$
are iid r.v.'s with common law $\mathcal{L}(X)$, d.f. $F$, ch.f. $f$ and $S_n = X_1 + \cdots + X_n$,
$n = 1, 2, \dots$; take $x > 0$ and set

\[ \mu_2(x) = \int_{-x}^{x} y^2\,dF(y), \qquad q(x) = 1 - F(x) + F(-x). \]


We say that $\mathcal{L}(X)$ belongs to the domain of attraction of a law $\mathcal{L}(Y)$, or
is attracted by $\mathcal{L}(Y)$ (an attracting law), if there are $a_n$ and $b_n > 0$ such
that $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}(Y)$. We exclude the trivial case of degenerate
attracting laws $\mathcal{L}(Y)$ for, according to 14.2, every $\mathcal{L}(X)$ belongs to its
domain of attraction with suitable $a_n$ and $b_n$, and this excludes consideration
of degenerate $\mathcal{L}(X)$. In fact, always according to 14.2, the above
definition pertains not to individual laws but to types of laws.

In terms of ch.f.'s, $\mathcal{L}(X)$ is attracted by $\mathcal{L}(Y)$ nondegenerate means
that, for every $u \in R$,

\[ e^{-iua_n} f^n(u/b_n) \to f_Y(u) \quad\text{nondegenerate.} \]

Thus, ch.f.'s $|f(u/b_n)|^{2n} \to |f_Y(u)|^2$, so that $|f(u/b_n)|^2 \to 1$ with nondegenerate
$|f_Y|^2$ hence $b_n \to \infty$. It follows that also $\mathcal{L}(S_n/b_{n+1} - a_n) \to \mathcal{L}(Y)$,
that is, $|f(u/b_{n+1})|^{2n} \to |f_Y(u)|^2$ and, by the Corollary to 14.2A,
$b_{n+1}/b_n \to 1$:

a. If $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}(Y)$ nondegenerate, then $b_n \to \infty$ and $b_{n+1}/b_n \to 1$.

Since $f(u/b_n) \to 1$, 24.5C applies with $X_{nk} = X_k/b_n$ hence $F_{nk}(x) = F(b_n x)$,
$k = 1, \dots, n$, and

b. $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}(Y)$ nondegenerate, necessarily i.d. with
$\psi = (\alpha, \beta^2, L)$, if and only if

$(C_L)$: $L_n \to L$ weakly, where $L_n(-x) = nF(-b_n x)$, $L_n(x) = n(F(b_n x) - 1)$.

$(C_{\beta^2})$: $n\mu_2(b_n x)/b_n^2 = n\int_{|y|<x} y^2\,dF(b_n y) \to \beta^2$ as $n \to \infty$ then $x \to 0$.

$(C_\alpha)$: $a_n = \alpha_n - \alpha + o(1)$ where $\alpha_n = n\int \frac{x}{1 + x^2}\,dF(b_n x)$.

With the help of these lemmas, used without further comment, we
begin by investigating condition $(C_L)$ and its implications for Lévy
functions of (nondegenerate) attracting laws $\mathcal{L}(Y)$. Clearly, the Lévy
function for normal $\mathcal{L}(Y)$ is $L_2 = 0$, and conversely. The others are
given by

A. Lévy functions and $(C_L)$. Let $x > 0$.

(i) Lévy functions $L_\gamma$ of nonnormal attracting laws $\mathcal{L}(Y)$ are given by

\[ L_\gamma(-x) = cp/x^\gamma, \qquad L_\gamma(x) = -cq/x^\gamma \]

where

\[ 0 < \gamma < 2, \quad c > 0, \quad p, q \ge 0 \text{ with } p + q = 1. \]

(ii) Condition $(C_{L_\gamma})$ is: as $x \to \infty$,

\[ F(-x)/q(x) \to p \quad\text{or}\quad (1 - F(x))/q(x) \to 1 - p \]

and

\[ x^\gamma q(x) = (c + o(1))h(x) \quad\text{where } h(x) \text{ varies slowly.} \]

The admissible $b_n$ are characterized by $nq(b_n) \to c$ as $n \to \infty$.

Proof. Condition $(C_L)$ reads: for $\pm x \in C(L)$, as $n \to \infty$,

(1) $nF(-b_n x) \to L(-x)$ and (2) $n(F(b_n x) - 1) \to L(x)$

hence

(3) $nq(b_n x) \to L(-x) - L(x)$.

In fact, any two of these three relations clearly imply the remaining one.

1°. Since $L = 0$ is excluded, there is an $x_0 > 0$ such that $L(-x_0) - L(x_0) > 0$
hence $L(-x) - L(x)$, being nonincreasing with increasing $x$,
is positive for $x \in (0, x_0]$. It follows that the Corollary of 25.1A applies
to (3) so that, setting $L_\gamma = L$, as $n \to \infty$,

(4) $nq(b_n x) \to L_\gamma(-x) - L_\gamma(x) = c/x^\gamma$

with $c > 0$ and $\gamma > 0$.

On the other hand, upon changing in (1) and (3) the fixed $x$ into fixed
$y$ and for every $x > 0$ selecting $n$ to be the smallest integer such that
$b_n y \le x \le b_{n+1} y$, we obtain

\[ \frac{n}{n+1}\cdot\frac{(n+1)F(-b_{n+1}y)}{nq(b_n y)} \le \frac{F(-x)}{q(x)} \le \frac{n+1}{n}\cdot\frac{nF(-b_n y)}{(n+1)q(b_{n+1}y)}. \]

Upon letting $x \to \infty$ so that $n \to \infty$, the extreme sides converge to
$p = L_\gamma(-y)/(L_\gamma(-y) - L_\gamma(y))$ so that

(5) $F(-x)/q(x) \to p$ with $0 \le p \le 1$

equivalently

(6) $(1 - F(x))/q(x) \to 1 - p$.

Thus, replacing in (5) $x$ by $b_n x$ with $x$ arbitrary but fixed, as $n \to \infty$,

\[ nF(-b_n x)/nq(b_n x) \to p \]

hence, by (4) and (1),

(7) $nF(-b_n x) \to L_\gamma(-x) = cp/x^\gamma$

and, similarly,

(8) $n(1 - F(b_n x)) \to -L_\gamma(x) = cq/x^\gamma$.

Since the requirement for any Lévy function, $\int_{-x}^{x} y^2\,dL_\gamma(y)$ finite, is
satisfied if and only if $\gamma < 2$, we must have $0 < \gamma < 2$. Thus (i), the
asserted form of Lévy functions $L_\gamma$ of nonnormal attracting laws $\mathcal{L}(Y)$,
is established.

2°. Condition $(C_{L_\gamma})$ became:

(5) $\lim_{x\to\infty} F(-x)/q(x) = p$, $0 \le p \le 1$,

and

(4) $\lim_n nq(b_n x) = c/x^\gamma$, $c > 0$, $0 < \gamma < 2$.

According to the Corollary of 25.1A, (4) implies

(9) $x^\gamma q(x) = (c + o(1))h(x)$ with $h(x)$ varying slowly.

On the other hand, setting $x = 1$ in (4), the scale factors $b_n$ must satisfy

(10) $\lim_n nq(b_n) = c > 0$.

Thus, if $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}(Y)$ nonnormal then (5), (9) and (10) hold.

Conversely, let (5), (9) and (10) hold. From (10) it follows that
$b_n = \inf\{x : q(x - 0) \ge c/n \ge q(x + 0)\} \to \infty$ hence

\[ \lim_n nq(b_n x)/c = \lim_n q(b_n x)/q(b_n) = x^{-\gamma}\lim_n h(b_n x)/h(b_n) = x^{-\gamma}, \]

and $\lim_n nq(b_n x) = c/x^\gamma$. Thus, $(C_{L_\gamma})$ holds and admissible $b_n$ satisfy
(10). The proof is terminated.


Remark. Clearly, $(C_{L_\gamma})$ can also be stated in a more symmetric
form:

\[ F(-x) = cp(1 + o(1))h(x)/x^\gamma \quad\text{and}\quad 1 - F(x) = cq(1 + o(1))h(x)/x^\gamma. \]
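The characterization $nq(b_n) \to c$ of admissible scale factors lends itself to a small numerical experiment. The following sketch is illustrative only: the tail $q(x) = \min(1, \log(e + x)/x^\gamma)$ with $\gamma = 1.5$, the choice $c = 1$, and the function names are assumptions made for the example. It solves $nq(b) = c$ by bisection and then compares $nq(b_n x)$ with $c/x^\gamma$, as in relation (4).

```python
import math

GAMMA = 1.5  # assumed tail exponent, 0 < GAMMA < 2

def q(x):
    """Hypothetical tail q(x) = h(x)/x^GAMMA with slowly varying h(x) = log(e + x)."""
    return min(1.0, math.log(math.e + x) / x ** GAMMA)

def scale_factor(n, c=1.0):
    """Solve n*q(b) = c for b by bisection; q is nonincreasing on [1, infinity)."""
    lo, hi = 1.0, 1.0
    while n * q(hi) > c:        # grow the bracket until n*q(hi) <= c
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if n * q(mid) > c:
            lo = mid
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    for n in (10**4, 10**8, 10**12):
        b = scale_factor(n)
        print(n, b, n * q(2.0 * b), 2.0 ** -GAMMA)
```

Because $h$ is only slowly varying, $nq(b_n x)$ approaches $c/x^\gamma$ slowly; the relative error is still a few percent at $n = 10^{12}$.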

In order to complete A we need more of Karamata theory. We write
v.s. for "varies slowly."


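*c. Slow variation lemma. Let $x \to \infty$.

(i) If $\mu_2(\infty) = \infty$ then
for $0 < \gamma < 2$:

\[ x^2 q(x)/\mu_2(x) \to (2 - \gamma)/\gamma \iff \mu_2(x)/x^{2-\gamma} \text{ v.s.} \iff x^\gamma q(x) \text{ v.s.}; \]

for $\gamma = 2$:

\[ x^2 q(x)/\mu_2(x) \to 0 \iff \mu_2(x) \text{ v.s.} \]

(ii) If $0 < \mu_2(\infty) < \infty$ then

\[ x^2 q(x)/\mu_2(x) \to 0 \quad\text{and}\quad \mu_2(x) \text{ v.s.} \]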


Proof. If $\mu_2(\infty) = \infty$ then (i) follows from 25.1D with $G(x) = F(x) - F(-x)$,
$\alpha = 2$, and $\beta = 0$, so that $\nu_0(x) = q(x)$.

If $0 < \mu_2(\infty) < \infty$ then $x^2 q(x) \le \int_{|y| \ge x} y^2\,dF(y) \to 0$; consequently
$x^2 q(x)/\mu_2(x) \to 0$ while, clearly, $\mu_2(tx)/\mu_2(t) \to 1$ as $t \to \infty$, that is, $\mu_2(x)$
varies slowly.


Remark. Recall that when $0 < \mu_2(\infty) < \infty$ then, taking $X$ centered
at its expectation and setting $\sigma^2 = \sigma^2 X = \mu_2(\infty)$, $\mathcal{L}(S_n/\sigma\sqrt{n}) \to \mathcal{N}(0, 1)$
since

\[ f^n(u/\sigma\sqrt{n}) = \Big(1 - \frac{u^2}{2n}(1 + o(1))\Big)^n \to e^{-u^2/2}. \]

Thus, when $0 < \mu_2(\infty) < \infty$ then $\mathcal{L}(X)$ is attracted by normal $\mathcal{N}(0, 1)$,
and other types of attracting laws may happen only when $\mu_2(\infty) = \infty$.
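The convergence $f^n(u/\sigma\sqrt{n}) \to e^{-u^2/2}$ in the Remark is easy to observe on the simplest example. The sketch below assumes symmetric steps $X = \pm 1$ with probability 1/2 each, so that $\sigma = 1$ and $f(u) = \cos u$; the function name is hypothetical.

```python
import math

def chf_normalized_sum(u, n):
    """Ch.f. of S_n / sqrt(n) for iid steps +1 or -1 with probability 1/2 each."""
    return math.cos(u / math.sqrt(n)) ** n

if __name__ == "__main__":
    u = 1.0
    for n in (10, 10**3, 10**6):
        print(n, chf_normalized_sum(u, n), math.exp(-u * u / 2.0))
```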


We say that $\mathcal{L}(X)$ is stable if, for every $n$, there are $a_n$ and $b_n > 0$ such
that $\mathcal{L}(S_n/b_n - a_n) = \mathcal{L}(X)$; clearly, stable laws are attracted by
themselves. Note that these are the "stable" laws introduced in 24.4. We write
$L_\gamma(c, p)$ for $L_\gamma$ characterized by $c$ and $p$ as in A(i).
B. Stability and attraction criteria. Let $x > 0$.

(i) The family of all nondegenerate attracting laws consists of all
nondegenerate stable laws. They are i.d. laws with $\psi_\gamma = (\alpha, \beta_\gamma^2, L_\gamma)$,
$0 < \gamma \le 2$ and

for $0 < \gamma < 2$:

\[ \beta_\gamma^2 = 0, \quad L_\gamma(-x) = cp/x^\gamma, \quad L_\gamma(x) = -cq/x^\gamma \]

where $c > 0$, $p, q \ge 0$, $p + q = 1$;

for $\gamma = 2$:

\[ \beta_2^2 > 0, \quad L_2 = 0. \]

(ii) $\mathcal{L}(X)$ is attracted by some $\mathcal{L}_\gamma$ with given $\gamma \in (0, 2]$ if and only if, as
$x \to \infty$,

\[ x^2 q(x)/\mu_2(x) \to (2 - \gamma)/\gamma. \]

$\mathcal{L}(X)$ is attracted by $\mathcal{L}_\gamma$ with given $L_\gamma(c, p)$ if and only if, as $x \to \infty$,

for $0 < \gamma < 2$:

\[ F(-x)/q(x) \to p, \qquad q(x) = c(1 + o(1))h(x)/x^\gamma \]

where $h(x)$ varies slowly, and admissible $b_n$ are characterized by $nq(b_n) \to c$
as $n \to \infty$;

for $\gamma = 2$:

$\mu_2(x)$ varies slowly and admissible $b_n$ are characterized by
$n\mu_2(b_n)/b_n^2 \to \beta_2^2 > 0$ as $n \to \infty$.

In either case, admissible $a_n$ are characterized by

\[ a_n = \alpha_n - \alpha + o(1) \quad\text{where}\quad \alpha_n = n\int \frac{x}{1 + x^2}\,dF(b_n x). \]


Proof. Stability assertion is immediate. For, every stable law is
attracted by itself while, conversely, the attracting laws $\mathcal{L}_\gamma$ are stable for
$b_n = n^{1/\gamma}$: use the form of Lévy functions $L_\gamma$ in A(i).

In A, we already found, for $0 < \gamma < 2$, $(C_{L_\gamma})$ and the $L_\gamma$ as well as a
characterization of admissible $b_n$. It remains to examine

$(C_{\beta_\gamma^2})$: $n\mu_2(b_n x)/b_n^2 \to \beta_\gamma^2$ as $n \to \infty$ then $x \to 0$,

and to find admissible $b_n$ for $\gamma = 2$.

1°. Nonnormal case: $0 < \gamma < 2$. $(C_{L_\gamma})$ is given by: as $x \to \infty$,

(1) $F(-x)/q(x) \to p$, $0 \le p \le 1$

and

(2) $q(x) = c(1 + o(1))h(x)/x^\gamma$, $h(x)$ slowly varying, $c > 0$,

or, when not specifying $c$,

(3) $x^\gamma q(x)$ varies slowly,

while admissible $b_n$ are characterized by

(4) $nq(b_n) \to c > 0$ as $n \to \infty$.

We must have $\mu_2(\infty) = \infty$, for $0 < \mu_2(\infty) < \infty$ implies normality,
that is, $\gamma = 2$ with $L_2 = 0$. But then c(i) applies and (3) is equivalent
to: as $x \to \infty$,

(5) $x^2 q(x)/\mu_2(x) \to (2 - \gamma)/\gamma$

and to

(6) $\mu_2(x)/x^{2-\gamma}$ varies slowly.

Upon replacing $x$ by $b_n$ in (5) and using (4), as $n \to \infty$, we obtain

(7) $n\mu_2(b_n)/b_n^2 \to c' = c\gamma/(2 - \gamma) > 0$.

But (6) implies that as $n \to \infty$

\[ \frac{n\mu_2(b_n x)/b_n^2}{n\mu_2(b_n)/b_n^2} \to x^{2-\gamma} \]

hence, by (7),

(8) $n\mu_2(b_n x)/b_n^2 \to c' x^{2-\gamma}$.

Therefore, for $0 < \gamma < 2$, $(C_{\beta_\gamma^2})$ becomes

\[ 0 \leftarrow n\mu_2(b_n x)/b_n^2 \to \beta_\gamma^2 \quad\text{as } n \to \infty \text{ then } x \to 0, \]

so that $\beta_\gamma^2 = 0$, and we have the asserted $\psi_\gamma = (\alpha, 0, L_\gamma)$, and convergence.

2°. Normal case: $\gamma = 2$. Nondegenerate normal laws correspond to
$\psi_2 = (\alpha, \beta_2^2, 0)$ with $\beta_2^2 > 0$. $(C_{L_2})$ and $(C_{\beta_2^2})$ become: as $n \to \infty$,

(1) $nq(b_n x) \to 0$

and

(2) $n\mu_2(b_n x)/b_n^2 \to \beta_2^2$, $0 < \beta_2^2 < \infty$;

setting $x = 1$, admissible $b_n$ are characterized by

(3) $n\mu_2(b_n)/b_n^2 \to \beta_2^2$.

If $0 < \mu_2(\infty) < \infty$ so that, by c(ii), $\mu_2(x)$ varies slowly and
$x^2 q(x)/\mu_2(x) \to 0 = (2 - \gamma)/\gamma$ for $\gamma = 2$, then it is easily seen that for $X$
centered at its expectation, (1) and (2) hold with $b_n = \sigma\sqrt{n}$, $\sigma^2 = \sigma^2 X = \mu_2(\infty)$,
and we have the required convergence (as we already knew).
Thus, it remains to consider the case $\mu_2(\infty) = \infty$. Then, by c(i), as
$x \to \infty$, $x^2 q(x)/\mu_2(x) \to 0$ is equivalent to $\mu_2(x)$ varying slowly.

Let $\mu_2(x)$ vary slowly so that $\mu_2(x)/x^2 \to 0$ as $x \to \infty$. Then (3) holds
for $b_n = \sup\{x : \mu_2(x)/x^2 \ge \beta_2^2/n\}$ and $b_n \to \infty$, so that
$\lim_n \mu_2(b_n x)/\mu_2(b_n) = 1$ becomes, by (3), $n\mu_2(b_n x)/b_n^2 \to \beta_2^2 > 0$, that is, (2) holds. Since
$\lim_{x\to\infty} x^2 q(x)/\mu_2(x) = 0$, upon replacing therein $x$ by $b_n x$ with $x > 0$
arbitrary but fixed, we have

\[ \lim_n \frac{nq(b_n x)}{n\mu_2(b_n x)/b_n^2} = 0 \]

hence, by (2), $\lim_n nq(b_n x) = 0$, that is, (1) holds. Thus, $\mu_2(x)$ varying
slowly implies $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}_2$ for admissible $a_n$.


Conversely, let $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}_2$, so that (1) and (2) hence (3)
hold. We prove that $\mu_2(x)$ varies slowly, that is, $(\mu_2(xt) - \mu_2(x))/\mu_2(x) \to 0$
as $x \to \infty$ for, say, $t > 1$. Let $x \to \infty$ and let $n$ be such that $b_n \le x < b_{n+1}$
so that $n \to \infty$. Then, since $\mu_2$ is nondecreasing and $b_{n+1}/b_n \to 1$, (3) implies that
$n\mu_2(x)/b_n^2 \to \beta_2^2 > 0$, that is, $\mu_2(x) \sim \beta_2^2 b_n^2/n$. Since $b_{n+1}/b_n \to 1$, by (1),

\[ \mu_2(xt) - \mu_2(x) \le -\int_{b_n}^{t b_{n+1}} y^2\,dq(y) \le t^2 b_{n+1}^2 q(b_n) = t^2 (b_{n+1}^2/b_n^2)(b_n^2/n)\, nq(b_n) = o(b_n^2/n), \]

and the assertion follows.
The proof is terminated. 
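The normal case with infinite variance is worth a concrete look. The sketch below is an illustration under stated assumptions, not part of the proof: it takes a symmetric law with $q(x) = x^{-2}$ for $x \ge 1$, for which $\mu_2(x) = 2\log x$ varies slowly although $\mu_2(\infty) = \infty$, and it tries the scale factors $b_n = \sqrt{n \log n}$, which correspond to $\beta_2^2 = 1$. The function names are hypothetical.

```python
import math

def q(x):
    """Assumed two-sided tail P(|X| > x) = x^-2 for x >= 1."""
    return 1.0 if x < 1.0 else x ** -2.0

def mu2(x):
    """Truncated second moment: integral of y^2 * 2 y^-3 dy over [1, x] = 2 log x."""
    return 0.0 if x <= 1.0 else 2.0 * math.log(x)

def criterion(x):
    """x^2 q(x) / mu_2(x), which B(ii) requires to tend to (2 - 2)/2 = 0."""
    return x * x * q(x) / mu2(x)

def scaled_second_moment(n):
    """n mu_2(b_n) / b_n^2 for the trial scale factors b_n = sqrt(n log n)."""
    b = math.sqrt(n * math.log(n))
    return n * mu2(b) / (b * b)

if __name__ == "__main__":
    for x in (1e2, 1e10, 1e100):
        print("criterion", x, criterion(x))
    for n in (1e6, 1e50, 1e300):
        print("scaled", n, scaled_second_moment(n))
```

Here $n\mu_2(b_n)/b_n^2 = 1 + \log\log n/\log n$, so the limit 1 is approached only logarithmically.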

Consequences

1°. For stable laws

\[ \psi_\gamma(u) = i\alpha u - c|u|^\gamma\Big(1 + ib\,\frac{u}{|u|}\,h_\gamma(u)\Big), \quad u \in R,\ 0 < \gamma \le 2, \]

with $c > 0$ ($c = 0$ for degenerate laws), $b = p - q$ hence $|b| \le 1$ and

\[ h_\gamma(u) = \tan\frac{\pi\gamma}{2} \quad\text{or}\quad -\frac{2}{\pi}\log|u| \quad\text{according as } \gamma \ne 1 \text{ or } \gamma = 1. \]

Follows from B(i) by the computations in part 2° of the proof of 24.4B
where $\beta$ is replaced by $cp$, $\beta'$ by $cq$, and $b$ and $c$ are interchanged.

2°. Nondegenerate stable d.f.^s F y are infinitely differentiable and 
I F Y (n) I ^ I ⑷⑼ I positive, for every n = 1 ， 2, • • 
Proof. Since, by 1°, $|f_\gamma(u)| = \exp(-c|u|^\gamma)$ with $0 < \gamma \le 2$, $c > 0$, $f_\gamma$ is
integrable and so are the functions with values $u^n |f_\gamma(u)|$ for every $n$.
Therefore, the inversion formula becomes

\[ F_\gamma(x) - F_\gamma(a) = \frac{1}{2\pi}\int \frac{e^{-iua} - e^{-iux}}{iu}\,f_\gamma(u)\,du \]

and we can differentiate $n$ times under the integral sign for the integral
so obtained is absolutely convergent so that

\[ F_\gamma^{(n)}(x) = \frac{(-i)^{n-1}}{2\pi}\int u^{n-1} e^{-iux} f_\gamma(u)\,du. \]

It follows that J F y (n) (x) I ^ I 心⑷⑼ j > 0. 

Let $q(x) = 1 - F_\gamma(x) + F_\gamma(-x)$ and let $\mathcal{L}_\gamma$ be nondegenerate.

3°. If $\mathcal{L}_\gamma$ is a stable law with $0 < \gamma < 2$, then $x^\gamma q(x) \to c > 0$ as
$x \to \infty$.

Proof. We know that $\mathcal{L}_\gamma$ attracts itself with scale factors $b_n = n^{1/\gamma}$
(also true for $\gamma = 2$); this also follows from $|f_\gamma(u)| = \exp(-c|u|^\gamma)$
since $|f_\gamma^n(n^{-1/\gamma}u)| = |f_\gamma(u)|$. Therefore, by B(ii), replacing $b_n$ by $n^{1/\gamma}$ in
$nq(b_n) \to c > 0$, we have $(n^{1/\gamma})^\gamma q(n^{1/\gamma}) \to c$. Since $q(x)$ is nonincreasing
with $x$ increasing, taking $n^{1/\gamma} \le x \le (n + 1)^{1/\gamma}$, we obtain

\[ \frac{n}{n+1}\,(n + 1)\,q((n + 1)^{1/\gamma}) \le x^\gamma q(x) \le \frac{n+1}{n}\,n\,q(n^{1/\gamma}), \]

where the extreme terms tend to $c$ as $x \to \infty$ hence $n \to \infty$, and
$x^\gamma q(x) \to c$.
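Both steps of this proof can be seen explicitly on the standard Cauchy law, the stable law with $\gamma = 1$. The sketch below is an illustration: it uses the known facts that the standard Cauchy ch.f. is $e^{-|u|}$ and that its d.f. is $1/2 + \arctan(x)/\pi$; the function names are hypothetical. It checks that $|f^n(u/n)| = |f(u)|$ and that $xq(x) \to 2/\pi$.

```python
import math

def q_cauchy(x):
    """q(x) = P(|X| > x) for the standard Cauchy law and x > 0.

    Written as (2/pi) * atan(1/x), equal to 1 - (2/pi) * atan(x),
    to avoid cancellation for large x.
    """
    return (2.0 / math.pi) * math.atan(1.0 / x)

def chf_cauchy(u):
    """Ch.f. of the standard Cauchy law."""
    return math.exp(-abs(u))

def chf_scaled_sum(u, n, gamma=1.0):
    """Ch.f. of S_n / n^(1/gamma) for iid standard Cauchy summands."""
    return chf_cauchy(u / n ** (1.0 / gamma)) ** n

if __name__ == "__main__":
    for x in (1e1, 1e3, 1e6):
        print(x, x * q_cauchy(x), 2.0 / math.pi)
    print(chf_scaled_sum(2.5, 1000), chf_cauchy(2.5))
```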

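4°. If $\mathcal{L}(X)$ is attracted by $\mathcal{L}_\gamma$ then

(i) $E|X|^r < \infty$ for $0 \le r < \gamma \le 2$

(ii) $E|X|^r = \infty$ for $r > \gamma$ when $0 < \gamma < 2$.

If $\mathcal{L}(X) = \mathcal{L}_\gamma$ with $0 < \gamma < 2$, then $E|X|^r$ is finite or infinite according as
$0 \le r < \gamma$ or $r \ge \gamma$.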

Proof. If $\mu_2(\infty) = EX^2 < \infty$ then, by 9.3a, $E|X|^r < \infty$ for $r < \gamma = 2$,
while $E|X|^r$ may be finite or infinite for $r > 2$. This shows why
$\gamma = 2$ is to be excluded from (ii) and also that it suffices to prove (i)
when $\mu_2(\infty) = \infty$, even for $\gamma = 2$. Then, by c, as $x \to \infty$,

\[ x^\gamma q(x) = c(1 + o(1))h(x), \]

where $h(x)$ is slowly varying hence, by the Corollary of 25.1C, given
$\delta > 0$ there is an $a$ such that, for $x \ge a$,

\[ x^{-\delta} < h(x) < x^{\delta}. \]

On the other hand, by integration by parts,

\[ E|X|^r = \int |x|^r\,dF(x) = r\int_0^\infty x^{r-1} q(x)\,dx \]

so that $E|X|^r$ is finite or infinite according as $\int_a^\infty x^{r-1-\gamma} h(x)\,dx$ is finite
or infinite. Since, given $\delta > 0$, for $x \ge a$,

\[ x^{r-\gamma-\delta-1} < x^{r-1-\gamma} h(x) < x^{r-\gamma+\delta-1}, \]

it follows that $E|X|^r < \infty$ when $\lim_{x\to\infty} x^{r-\gamma+\delta} < \infty$ and $E|X|^r = \infty$
when $\lim_{x\to\infty} x^{r-\gamma-\delta} = \infty$.

If $0 \le r < \gamma \le 2$ then there is a positive $\delta < \gamma - r$, the first limit is
finite, $E|X|^r < \infty$ and (i) is proved.

If $r > \gamma$ with $\gamma < 2$ then there is a positive $\delta < r - \gamma$, the second
limit is infinite, $E|X|^r = \infty$ and (ii) is proved. It remains to show that
when $\mathcal{L}(X) = \mathcal{L}_\gamma$ with $0 < \gamma < 2$ then $E|X|^\gamma = \infty$. Since, by 3°,
$x^\gamma q(x) \to c > 0$, $x^{\gamma-1} q(x) \sim cx^{-1}$ for $x \to \infty$ so that $\int_0^\infty x^{\gamma-1} q(x)\,dx = \infty$,
$E|X|^\gamma = \infty$ and the proof is concluded.
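The dichotomy in 4° can be followed on the integration by parts formula itself. The sketch below is an illustration under an assumed tail $q(x) = \min(1, x^{-\gamma})$ (so $h \equiv 1$); the function name is hypothetical. It evaluates the truncated integral $r\int_0^T x^{r-1}q(x)\,dx$ in closed form and lets $T$ grow.

```python
import math

def truncated_abs_moment(r, gamma, T):
    """r * integral over [0, T] of x^(r-1) q(x) dx, for q(x) = min(1, x^-gamma), T >= 1."""
    if r == gamma:
        return 1.0 + gamma * math.log(T)
    return 1.0 + r * (T ** (r - gamma) - 1.0) / (r - gamma)

if __name__ == "__main__":
    gamma = 1.5
    for r in (1.0, 1.5, 1.8):
        values = [truncated_abs_moment(r, gamma, T) for T in (1e3, 1e6, 1e12)]
        print(f"r={r}: {values}")
```

For $r < \gamma$ the values settle at $\gamma/(\gamma - r)$; for $r = \gamma$ they grow like $\log T$, and for $r > \gamma$ like $T^{r-\gamma}$.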


§ 26. RANDOM WALK 

Random walks, that is, sequences of consecutive sums of iid summands, are
present, in various guises and various degrees of generality, in an
incredibly huge literature of applications of pr. theory to a very large
number of concrete problems: queuing processes connected with mass
service, dams, waiting times, renewal processes connected with storage
and inventories, risk theory, traffic flow, particle counters, and many
others. The present general random walk theory is relatively recent.

In 1921, Polya discovers "recurrence" and "nonrecurrence" phenomena
in his study of some simple random walks on lattices in $R$, $R^2$,
and $R^3$. Thirty years later, in a definitive work, Chung and Fuchs
settle this dichotomy problem for general random walks. Fluctuation
r.v.'s defined on the $n$ first terms of the random walk appear in the
concrete problems mentioned above. But it is only in 1949 that Andersen
begins his investigations into these r.v.'s for the general random
walk. Since then a large number of results were obtained by many
authors. They use variants of either the combinatorial or the analytic
methods.

The combinatorial method initiated by Andersen threw the doors
wide open. His approach was very involved. Spitzer simplified and
unified the combinatorial approach and obtained some of the most
important identities and limit theorems of the theory. His book, while
devoted to random walk on lattices only, contains a number of deep
ideas and significant examples. Feller, using ladder indices and ladder
variables, first introduced by Blackwell, reduced the combinatorial
approach to elementary mathematical arguments and, using Feller's
approach, Port, in a semi-expository paper, obtained a large number of
known identities and generalized some of them.

The analytic method, as used by Pollaczek since 1930, was very
involved and his work remained unnoticed until some of his results
were rediscovered. Ray, Kemperman, Baxter, Wendel, . . . , simplified
and unified in various ways the analytic approach and obtained
further identities. Kemperman's book presents in detail the approach
based on Liouville's theorem (already used by Pollaczek) and contains
a large number of examples. Baxter uses a method based on
Fourier-Stieltjes transforms and operators on functional Banach spaces. Wendel
introduces and investigates "order statistics" of $(S_1, \dots, S_n)$, . . .

No attempt will be made here to apply the general random walk 
theory to concrete problems. The interested reader will find in Feller’s 
two volumes a large number of such problems. 

26.1 Set-up and basic implications. A sequence $S = (S_1, S_2, \dots)$
of r.v.'s is called a random walk (on $R$) if the sequence of its random steps
$X = (X_1 = S_1, X_2 = S_2 - S_1, \dots)$ at times $n = 1, 2, \dots$ consists of
iid r.v.'s $X_1, X_2, \dots$. A random walk determines the sequence of its
random steps, and conversely; similarly for the sub $\sigma$-fields of events:

\[ \mathcal{B}_n = \mathcal{B}(X_1, \dots, X_n) = \mathcal{B}(S_1, \dots, S_n), \]

\[ \mathcal{C}_n = \mathcal{B}(X_{n+1}, X_{n+2}, \dots) = \mathcal{B}(S_{n+1} - S_n, S_{n+2} - S_n, \dots). \]

We denote by $\mathcal{B}_\infty = \mathcal{B}(X_1, X_2, \dots)$ the smallest $\sigma$-field generated by
the field $\bigcup_{n=1}^\infty \mathcal{B}_n$, and $\mathcal{C} = \bigcap_{n=1}^\infty \mathcal{C}_n$ is the tail $\sigma$-field of the sequence $X$; it is
important to realize that, in general, $\mathcal{C}$ is not the tail $\sigma$-field
$\bigcap_{n=1}^\infty \mathcal{B}(S_{n+1}, S_{n+2}, \dots)$ of the sequence $S$.
We shall frequently adjoin $S_0 = 0$ to the random walk so that it will
become $(S_0, S_1, S_2, \dots)$ with steps $X_n = S_n - S_{n-1}$, $n = 1, 2, \dots$.
Intuitively it means that the random walk starts at time 0 at the origin.
We could also make it start at some $x \in R$ or choose $S_0$ to be a r.v. If the
random steps obey a law $\mathcal{L}(X)$ with only values $\pm nd$, $d > 0$, $n = 0, 1, \dots$,
then we have a very simple Markov chain with countable state
space, $0, \pm d, \pm 2d, \dots$, and initial position 0, or some $n_0 d$, or a r.v.
with law $\mathcal{L}(X)$. It is strongly recommended that the reader interpret
the corresponding concepts and results in III of the Introductory Part
in the case of random walk theory, as found in this section.

The common law of the random steps will be denoted by $\mathcal{L}(X)$, its
d.f. on $R$ and corresponding pr. distribution on the Borel line will be
denoted by the same symbol $F$, and its ch.f. will be $f$. D.f.'s and
corresponding pr. distributions of their sums $S_n$, the "positions" of the random
walk at times $n$, will be denoted by $F_n$ and their ch.f.'s are $f^n$, $n = 1, 2, \dots$.
If $\mathcal{L}(X)$ degenerates at 0 then the random walk stays a.s. at
$\{0\}$; from now on we exclude this trivial case. Note that if $\mathcal{L}(X)$ degenerates
at $a \ne 0$ then the random walk moves a.s. by degenerate steps $a$ from
$na$ to $(n + 1)a$, $n = 1, 2, \dots$, and $S_n \to +\infty$ a.s. or $S_n \to -\infty$ a.s.
according as $a > 0$ or $a < 0$.

We distinguish two types of common laws $\mathcal{L}(X)$. Let $L_d = \{nd : n = 0, \pm 1, \pm 2, \dots\}$
be a lattice of span $d > 0$. We say that $X$ is $L_d$-distributed
if $\sum_{n=-\infty}^{+\infty} P(X = nd) = 1$ and there is no lattice of larger span $d' > d$ with
this property; according to the remark following 14.1a such a distribution
occurs if and only if $|f(u)| = 1$ for some $u \ne 0$. If there is no $d > 0$ such
that $X$ is $L_d$-distributed, we set $d = 0$, $L_0 = R$, and say that $X$ is $L_0$-distributed;
thus $X$ is $L_0$-distributed if and only if $|f(u)| < 1$ for all $u \ne 0$.
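For steps taking finitely many rational values, the span $d$ is a greatest common divisor and can be computed exactly. The sketch below is an illustration only: the restriction to rational values, the helper names `lattice_span` and `abs_chf`, and the example law are assumptions. It also evaluates $|f(u)|$ at $u = 2\pi/d$, where it should equal 1.

```python
import cmath
from fractions import Fraction
from math import gcd, pi

def lattice_span(values):
    """Largest d > 0 such that every value is an integer multiple of d.

    Values must be rational (ints or Fractions); floats are read as the
    exact binary fractions they store.
    """
    nonzero = [Fraction(v) for v in values if Fraction(v) != 0]
    if not nonzero:
        raise ValueError("law degenerate at 0: no span")
    den = 1
    for f in nonzero:                       # least common denominator
        den = den * f.denominator // gcd(den, f.denominator)
    num = 0
    for f in nonzero:                       # gcd of the rescaled numerators
        num = gcd(num, abs(f.numerator) * (den // f.denominator))
    return Fraction(num, den)

def abs_chf(values, probs, u):
    """|f(u)| for a law putting mass probs[k] at values[k]."""
    return abs(sum(p * cmath.exp(1j * u * float(v)) for v, p in zip(values, probs)))

if __name__ == "__main__":
    steps, probs = [-2, 3], [0.5, 0.5]
    d = lattice_span(steps)
    print("span:", d)
    print("|f(2*pi/d)|:", abs_chf(steps, probs, 2 * pi / float(d)))
    print("|f(1)|:", abs_chf(steps, probs, 1.0))
```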

We now examine basic implications of the above set-up. 

Possible values and states. We say that $x \in R$ is a possible value of
a r.v. $X$ if $P(X \in V_x) > 0$ for every neighborhood $V_x$ of $x$. We say that
$x$ is a possible state of the random walk $S = (S_1, S_2, \dots)$ if for every
given neighborhood $V_x$ of $x$ there is an $n = n(V_x)$ such that $P(S_n \in V_x) > 0$.
In either case, it suffices to consider neighborhoods of the form
$V_x = (x - \epsilon, x + \epsilon)$. Let $\Pi_s$ denote the possible states of the random
walk $S$, let $\Pi_n$ be the set of possible values of $S_n$, and set $\Pi_v = \bigcup_{n=1}^\infty \Pi_n$.
a. $\Pi_s$ contains $\Pi_v$ and is closed.

Proof. The first assertion results at once from the definitions. As
for the second one, since $x_m \to x$ as $m \to \infty$ implies that, given $\epsilon > 0$,
for $m$ sufficiently large $(x_m - 1/m, x_m + 1/m) \subset (x - \epsilon, x + \epsilon)$, it
follows that when the $x_m$ are possible states there is an $n$ such that

\[ P(|S_n - x| < \epsilon) \ge P(|S_n - x_m| < 1/m) > 0. \]

We say that $x$ is a discontinuity value of $X$ if $P(X = x) > 0$. Clearly,
the set of discontinuity values of $X$ is the set of jumps of the
discontinuous part $F_X^d$ of the d.f. $F_X$.

b. If $x$ and $y$ are possible values of independent r.v.'s $X$ and $Y$
respectively, then $x + y$ is a possible value of $X + Y$.

If $x$ and $y$ are discontinuity values of independent r.v.'s $X$ and $Y$
respectively then $x + y$ is a discontinuity value of $X + Y$, and all such values
of $X + Y$ are of this form.

The first assertion obtains by

\[ P(|X + Y - (x + y)| < \epsilon) \ge P(|X - x| < \epsilon/2) \times P(|Y - y| < \epsilon/2) > 0 \]

and the second one results from

\[ (F_X * F_Y)^d = F_X^d * F_Y^d. \]

A. Possible values theorem. Let $X$ be $L_d$-distributed, with $d \ge 0$.

(i) If neither $X \ge 0$ a.s. nor $X \le 0$ a.s. then when $d > 0$, $\Pi_v = L_d$
and when $d = 0$, $\Pi_v$ is dense in $L_0 = R$.

(ii) If either $X \ge 0$ a.s. or $X \le 0$ a.s. then when $d > 0$, from some $n$ on,
$nd$ or $-nd$, respectively, belong to $\Pi_v$ and when $d = 0$, for every given $\epsilon > 0$,
from some $x > 0$ on, $\Pi_v$ intersects $(x, x + \epsilon)$ or $(-x - \epsilon, -x)$, respectively.

Proof. We use b without further comment. We can assume that
$S_1 = X_1$ has a positive value $a$ so that $S_2 = X_1 + X_2$ has positive
value $2a$; otherwise, we change $X$ into $-X$. Thus, it suffices to prove
the theorem when there are positive values $a < b$. We follow Feller.

1°. Set $J_n = [na, nb)$. For $n \ge n_1 > a/(b - a)$, $[na, (n + 1)a) \subset J_n$
hence $\bigcup_{n \ge n_1} J_n = [n_1 a, \infty)$ and every $x \ge n_1 a$ belongs to some of the $J_n$ for
$n \ge n_1$. Since the $n + 1$ points $na + k(b - a)$, $k = 0, \dots, n$, belong
to $\Pi_v$ and subdivide $J_n$ into intervals of length $b - a$, every $x \ge n_1 a$ is at
a distance at most $(b - a)/2$ from a member of $\Pi_v$.

2°. Suppose that for every given $\epsilon > 0$ there are possible values
$(0 <)\, a < b$ with $b - a < \epsilon$. Then $X$ is $L_0$-distributed for otherwise
every $\Pi_n$ hence $\Pi_v$ is contained in $L_d$ for some fixed $d > 0$ and we reach
a contradiction.

If $X \ge 0$ a.s., the assertion in (ii) for $d = 0$ follows from 1°.

If neither $X \ge 0$ a.s. nor $X \le 0$ a.s. then $X$ has a possible value
$c < 0$. Given $\epsilon > 0$, it follows from 1° that for arbitrary $x$ and
sufficiently large $n$ there is a $y \in \Pi_v$ belonging to $(-nc + x, -nc + x + \epsilon)$.
But $y + nc$ also belongs to $\Pi_v$. Thus, every interval of any given length
$\epsilon > 0$ intersects $\Pi_v$ so that $\Pi_v$ is dense in $L_0 = R$ and the assertion in (i)
for $d = 0$ is proved.

3°. Suppose now that whichever be the possible values $(0 <)\, a < b$,
there is an $\epsilon > 0$ such that $b - a \ge \epsilon$; we may assume $b - a < 2\epsilon$ for
some $a$ and $b$. Then the set $J_n \cap \Pi_v$ consists of points $na + k(b - a)$,
$k = 0, \dots, n$. Since $(n + 1)a$ is one of them, they all are multiples
of $b - a$. But for any $c \in \Pi_v$, for $n$ sufficiently large $J_n$ has a point of
the form $c + k(b - a)$ so that $c$ is also a multiple of $b - a$. Thus $X$ is
$L_d$-distributed with some $d > 0$ and the proof is completed.

Corollary. Let $X$ be $L_d$-distributed with $d \ge 0$. If neither $X \ge 0$
nor $X \le 0$ then the set of all possible states of the random walk coincides
with $L_d$.

Follows at once by a. 
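For lattice steps the set $\Pi_v$ can be enumerated directly, which makes both parts of the theorem visible. The sketch below is an illustration under assumed step laws: integer steps with possible values $\{-2, 3\}$ for case (i) and $\{2, 3\}$ for case (ii), both of span $d = 1$. The function name is hypothetical and only the first `max_n` sets $\Pi_n$ are collected.

```python
def possible_values(steps, max_n):
    """Union of the possible values of S_1, ..., S_max_n for steps with finitely many values."""
    reachable = set()
    current = {0}                            # S_0 = 0
    for _ in range(max_n):
        current = {s + x for s in current for x in steps}
        reachable |= current
    return reachable

if __name__ == "__main__":
    two_sided = possible_values([-2, 3], 15)
    print(sorted(v for v in two_sided if -10 <= v <= 10))   # every integer appears
    one_sided = possible_values([2, 3], 10)
    print(sorted(v for v in one_sided if v <= 12))          # 1 is missing, then all n >= 2
```

In the two-sided case every integer of a bounded window is reached after finitely many steps, as (i) asserts; in the one-sided case the value 1 is never reached but all $n \ge 2$ are, as in (ii).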

From now on, we take for $\Omega$ the set $\Omega = R^\infty$ of all numerical sequences
$x = (x_1, x_2, \dots)$ and for the $\sigma$-field of events the $\sigma$-field $\mathcal{B}$ of Borel sets
in $R^\infty$, that is, the $\sigma$-field generated by the class of all cylinders of the
form $C(A_1 \times \cdots \times A_n)$, $n = 1, 2, \dots$, where the $A$'s are linear
Borel sets. This choice does not restrict generality yet avoids
possible ambiguities, say, about "translations."

SLLN and 0-1 laws.

According to 17.4.4°

For $0 < r < 2$, $\frac{1}{n^{1/r}} \sum_{k=1}^n (X_k - a_k) \to 0$ a.s., where $a_k = 0$ or $EX$ according
as $r < 1$ or $r \ge 1$, if and only if $E|X|^r < \infty$.

For $r = 1$, we have Kolmogorov strong law of large numbers, SLLN
for short, which can be completed as follows (see also 34.4).

B. SLLN. Let $EX$ exist. Then $S_n/n \to EX$ a.s. Conversely, if
$S_n/n \to c$ a.s., necessarily a constant (finite or infinite), then $EX = c$.
Proof. It suffices to complete SLLN by considering the cases of
infinite $EX$ and $c$.

Let $EX = +\infty$, that is, $EX^+ = +\infty$, $EX^- < \infty$ and let $X_k(a) = X_k$
or $a \in R$ according as $X_k < a$ or $X_k \ge a$. Set $S_n(a) = \sum_{k=1}^n X_k(a)$. Since
$EX(a)$ is finite,

\[ S_n/n \ge S_n(a)/n \to EX(a) \quad\text{a.s.} \]

hence, letting $a \uparrow \infty$ so that, by monotone convergence theorem,
$EX(a) \uparrow EX = +\infty$, we obtain $S_n/n \to EX = +\infty$ a.s.; similarly for
$EX = -\infty$, or change $X$ into $-X$.

For the converse, if $c = +\infty$ then, by what precedes,

\[ EX^+ \leftarrow \frac{1}{n}\sum_{k=1}^n X_k^+ = \frac{S_n}{n} + \frac{1}{n}\sum_{k=1}^n X_k^- \to +\infty + EX^- \quad\text{a.s.} \]

so that $EX^+ = +\infty$, hence $EX = +\infty$ since $EX$ exists; similarly for
$c = -\infty$, or change $X$ into $-X$.
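Both halves of the completed SLLN can be watched in a simulation. The sketch below is an illustration under assumptions: exponential steps with $EX = 1$ for the finite case, and nonnegative steps with $P(X > x) = x^{-1/2}$ for $x \ge 1$, hence $EX = +\infty$, for the infinite case. The function names and the fixed seed are choices made for the example.

```python
import random

def running_mean(sampler, n, seed=0):
    """Return S_n / n for n iid draws produced by sampler(rng)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += sampler(rng)
    return total / n

def exponential_step(rng):
    """Step with finite mean EX = 1."""
    return rng.expovariate(1.0)

def heavy_step(rng, gamma=0.5):
    """Nonnegative step with P(X > x) = x^(-gamma) for x >= 1, so EX = +infinity."""
    u = 1.0 - rng.random()                  # uniform on (0, 1]
    return u ** (-1.0 / gamma)

if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        print(n, running_mean(exponential_step, n), running_mean(heavy_step, n))
```

In the first column $S_n/n$ settles near 1, while in the second it keeps growing, in accordance with $S_n/n \to +\infty$ a.s.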

SLLN utilizes fully the iid property of the summands. Independence
alone yields, as we know (16.3B),

Kolmogorov zero-one law. On a sequence of independent r.v.'s tail
events have for pr. either 0 or 1 and tail functions are degenerate.

This zero-one law, while applying to $X = (X_1, X_2, \dots)$, does not
apply to the random walk $S = (S_1, S_2, \dots)$. Yet, the iid property of
the summands implies "exchangeability", and a new zero-one law will
apply to $S$:

We say that a sequence $X = (X_1, X_2, \dots)$ of r.v.'s is exchangeable,
or that the r.v.'s $X_1, X_2, \dots$ are exchangeable if the distribution of $X$
is invariant under all finite exchanges of its terms or, equivalently, of
their subscripts; in symbols, for every $n$ and every one of the $n!$
permutations $\omega_n$ of $(1, \dots, n)$ into $(k_1, \dots, k_n)$,

\[ \mathcal{L}(X) = \mathcal{L}(\omega_n X) = \mathcal{L}(X_{k_1}, \dots, X_{k_n}, X_{n+1}, \dots). \]

We say that a measurable function $g(X)$ is exchangeable if it is
invariant under all permutations $\omega_n$ of its arguments: $g(\omega_n X) = g(X)$, $n = 1, 2, \dots$;
in particular, an event on $X$ is exchangeable if its indicator is
exchangeable. Clearly, on $X$ every tail event and every tail function are
exchangeable. In fact, by the iid property of its terms, $X$ is exchangeable
while, for every $n$, the sequences $(S_n, S_{n+1}, \dots)$ are invariant under
permutations $\omega_n$ of $(1, \dots, n)$. Thus, the second assertion below
follows at once, while the first one results directly from the definitions:
C. (i) On $X$, exchangeable events form a $\sigma$-field $\mathcal{E}$ and exchangeable
functions are $\mathcal{E}$-measurable.

(ii) On the random walk $S$ corresponding to the sequence $X$ of iid steps,
tail events are exchangeable (belong to $\mathcal{E}$) and tail functions are
exchangeable (are $\mathcal{E}$-measurable).

In general, tail events and tail functions on $S$, say $[S_n \in A_n \text{ i.o.}]$ where
$A_n$ are linear Borel sets, $\liminf S_n$, $\limsup S_n$, while exchangeable, are not
tail events on $X$ and Kolmogorov zero-one law does not apply. Yet

B. Hewitt-Savage zero-one law. On a sequence of iid r.v.'s
exchangeable events have for pr. either 0 or 1 and exchangeable functions are
degenerate.

To prove this theorem we require an elementary measure-theoretic
proposition. Let $A \,\Delta\, B = AB^c + A^c B$.

d. Approximation lemma. Let $(\Omega, \mathcal{A}, P)$ be a pr. space. If a field $\mathcal{D}$
generates $\mathcal{A}$ then for every event $A$ and every $\epsilon > 0$ there is a $D \in \mathcal{D}$
such that $P(A \,\Delta\, D) < \epsilon$.

For, clearly, the class of all sets $A$ with the asserted property
contains $\mathcal{D}$ and it is easily verified that this class is monotone; thus, by
1.6A, it coincides with $\mathcal{A}$.

The approximation property can be restated as follows. Let $A \in \mathcal{A}$
and $\epsilon_n \downarrow 0$. There are $D_n \in \mathcal{D}$ such that $P(A \,\Delta\, D_n) \le \epsilon_n \to 0$, that is,
$P(AD_n^c) \to 0$ and $P(A^c D_n) \to 0$. Therefore, $PD_n \to PA$ since
$PA = PAD_n + PAD_n^c = PD_n - PA^c D_n + PAD_n^c$.

Proof of B. In our case $\mathcal{D} = \bigcup_n \mathcal{B}_n$ so that, given an exchangeable
event $A$ (in fact, any event) there is a sequence $B_n \in \mathcal{B}_{k_n}$ with
$P(A \,\Delta\, B_n) \to 0$ hence $PB_n \to PA$; we can and do select $k_1 < k_2 < \cdots$.
Let $C_n$ be the events obtained from $B_n$ by the permutation of
$(1, \dots, k_n, k_n + 1, \dots, 2k_n)$ into $(k_n + 1, \dots, 2k_n, 1, \dots, k_n)$; thus,
$B_n \in \mathcal{B}_{k_n} = \mathcal{B}(X_1, \dots, X_{k_n})$ implies $C_n \in \mathcal{C}_{k_n} = \mathcal{B}(X_{k_n+1}, X_{k_n+2}, \dots)$ and, $\mathcal{B}_{k_n}$
and $\mathcal{C}_{k_n}$ being independent, so are $B_n$ and $C_n$. But this permutation leaves
the distribution of $X$ invariant while $A$, being exchangeable, remains the
same and $A \,\Delta\, B_n$ is changed into $A \,\Delta\, C_n$ so that

\[ P(A \,\Delta\, C_n) = P(A \,\Delta\, B_n) \to 0 \]

hence $PC_n \to PA$; also

\[ P(A \,\Delta\, B_n C_n) \le P(A \,\Delta\, B_n) + P(A \,\Delta\, C_n) \to 0 \]

hence $P(B_n C_n) \to PA$. Therefore, $B_n$ and $C_n$ being independent,

\[ PA \leftarrow P(B_n C_n) = PB_n \cdot PC_n \to (PA)^2 \]

so that $PA = 0$ or 1. The first assertion is proved and the second
follows.

Consequences. 1. G A n i.o.] = 0 or 1, liminf S n and limsup 
S n are degenerate. 

2. Three alternatives. For a {nondegenerate at 0) random walk 
(*S ， i,*S ， 2 , • - •) there are exactly three asymptotic alternatives'. 

(i) S n — <» (drifts to — «>) 

(ii) S n —^ +oo {drifts /o + «) 

(iii) — oo = liminf S n < limsup = + 00 a.s. {oscillates between 
— oo and + oo). 

Proof. Since $\liminf S_n = c$ a.s. where the constant $c$ may be finite or 
infinite, and $(S_2 - S_1, S_3 - S_1, \ldots)$ has the same distribution as 
$(S_1, S_2, \ldots)$, we have 

$$\liminf (S_n - X_1) = \liminf S_n \quad \text{a.s.}$$

hence $c = X_1 + c$ a.s. The case $X_1 = 0$ a.s. being excluded (that is, the 
trivial alternative that the random walk stays at 0 a.s. is excluded), we must 
have $c = +\infty$ or $c = -\infty$. Thus a.s. 

$$\text{either } \liminf S_n = -\infty \text{ or } \lim S_n = \liminf S_n = +\infty$$

and, changing $X$ into $-X$, a.s. 

$$\text{either } \limsup S_n = +\infty \text{ or } \lim S_n = \limsup S_n = -\infty.$$

The three alternatives assertion follows. 
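
The three alternatives can be illustrated numerically. The sketch below is only an illustration under assumed inputs that are not part of the text: Gaussian steps with unit variance, a hypothetical helper `running_extremes`, and a fixed seed. A finite simulation can suggest, but of course not prove, drift or oscillation.

```python
import random

def running_extremes(mean, n_steps=10_000, seed=0):
    """Simulate S_1..S_n with Gaussian steps of the given mean.

    Returns (min, max, final) of the path, with S_0 = 0 included in min/max.
    """
    rng = random.Random(seed)
    s = 0.0
    lo = hi = 0.0
    for _ in range(n_steps):
        s += rng.gauss(mean, 1.0)
        lo = min(lo, s)
        hi = max(hi, s)
    return lo, hi, s

# Negative mean, zero mean, positive mean: the three regimes to compare.
for mean in (-0.5, 0.0, 0.5):
    lo, hi, final = running_extremes(mean)
    print(f"EX={mean:+.1f}: min={lo:10.1f} max={hi:10.1f} S_n={final:10.1f}")
```

With a nonzero mean the final value is far from 0 on the side of the drift, while with zero mean both the running minimum and the running maximum typically keep growing in absolute value.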

Random times. 

Translations $\theta_n$ on $X = (X_1, X_2, \ldots)$ are defined by 

$$\theta_n X = \theta_n(X_1, X_2, \ldots) = (X_{n+1}, X_{n+2}, \ldots), \quad n = 1, 2, \ldots$$

so that the terms of $\theta_n X$ are iid with same common law $\mathcal{L}(X)$ as the terms of $X$. 
Thus, $\theta_n X$ has same distribution as $X$ and therefore $X$ is said to be 
stationary (see also 33.3). The random walks corresponding to $X$ and to 
$\theta_n X$ are, respectively, $(S_1, S_2, \ldots)$ and $(S_{n+1} - S_n, S_{n+2} - S_n, \ldots)$ with 
same distribution, and the $\sigma$-fields $\mathcal{B}(S_1, \ldots, S_n) = \mathcal{B}_n$ and 
$\mathcal{B}(S_{n+1} - S_n, S_{n+2} - S_n, \ldots) = \mathcal{B}(X_{n+1}, X_{n+2}, \ldots) = \mathcal{C}_n$ are independent. 
These properties extend to "random times", times $n$ $(= 1, 2, \ldots)$ 
becoming "degenerate" random times, as follows (see also 39.2 and 41.4 
taking therein $T = (1, 2, \ldots)$ and $t = n$). 

Given a nondecreasing sequence $(\mathcal{B}_n)$ of sub $\sigma$-fields of events, a 
measurable function $\tau$ to $(1, 2, \ldots, \infty)$ is a $(\mathcal{B}_n)$-time if $[\tau = n] \in \mathcal{B}_n$, 
$n = 1, 2, \ldots$; if there is no confusion possible, we say that $\tau$ is a random 
time. Clearly, a random time $\tau$ is $\mathcal{B}_\tau$-measurable with $\sigma$-field 
$\mathcal{B}_\tau = \{\text{events } B: B[\tau = n] \in \mathcal{B}_n,\ n = 1, 2, \ldots\}$. If $\tau < \infty$ a.s., we define 
$X_{\tau+k}(\omega)$ by $X_{\tau(\omega)+k}(\omega)$ so that the $X_{\tau+k}$ are r.v.'s, $k = 0, 1, \ldots$. Then 
the $\sigma$-field $\mathcal{B}(X_{\tau+1}, X_{\tau+2}, \ldots)$ is denoted by $\mathcal{C}_\tau$ and translation by $\tau$ of $X$ 
is defined by 

$$\theta_\tau(X_1, X_2, \ldots) = (X_{\tau+1}, X_{\tau+2}, \ldots).$$

The above properties of translations by $n$ remain valid as follows. 

C. Random times translations. If a $(\mathcal{B}_n)$-time $\tau < \infty$ a.s. then the 
$\sigma$-fields $\mathcal{B}_\tau$ and $\mathcal{C}_\tau$ are independent and the sequences $X = (X_1, X_2, \ldots)$ 
and $\theta_\tau X = (X_{\tau+1}, X_{\tau+2}, \ldots)$ have same distribution. 

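Proof. The assertions mean that, for any pair of events $B_\tau \in \mathcal{B}_\tau$ and 
$B \in \mathcal{B}_\infty = \mathcal{B}(X_1, X_2, \ldots)$, 

$$(1) \qquad P(B_\tau[\theta_\tau X \in B]) = PB_\tau \cdot P(X \in B).$$

By definition, $\theta_\tau = \theta_n$ on $[\tau = n]$ hence 

$$P(B_\tau[\theta_\tau X \in B]) = \sum_{n=1}^{\infty} P(B_\tau[\tau = n][\theta_n X \in B]).$$

Since $B_\tau[\tau = n] \in \mathcal{B}_n$ and $\theta_n X$ is $\mathcal{C}_n$-measurable, independence of $\mathcal{B}_n$ and 
$\mathcal{C}_n$ implies 

$$P(B_\tau[\tau = n][\theta_n X \in B]) = P(B_\tau[\tau = n]) \cdot P(\theta_n X \in B).$$

Since $\sum_{n=1}^{\infty} P(\tau = n) = 1$, and $\theta_n X$ has same distribution as $X$, (1) becomes 

$$P(B_\tau[\theta_\tau X \in B]) = \sum_{n=1}^{\infty} P(B_\tau[\tau = n]) \cdot P(X \in B) = PB_\tau \cdot P(X \in B)$$

and the proof is terminated. 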

The above argument is characteristic of extensions of properties of 
times $n$ to random times $\tau < \infty$ a.s.: use the definitions and the asserted 
property, valid on $[\tau = n]$, $n = 1, 2, \ldots$. For example, upon setting 
$\tau_1 = \tau < \infty$ a.s. then defining $\tau_2$ by $\tau_2 = \tau$ but on $\theta_{\tau_1}X$ in lieu of $X$, and 
so on, it easily follows that 

$$S_{\tau_1},\ S_{\tau_1+\tau_2} - S_{\tau_1},\ \ldots \ \text{are iid r.v.'s.}$$

By the same procedure but with $E$ in lieu of $P$, additivity of expectations, 
which for random walks becomes $ES_n = nEX$, extends to $\tau < \infty$ 
a.s. in lieu of $n$, upon using $E\tau = \sum_{n=1}^{\infty} nP(\tau = n) = \sum_{n=1}^{\infty} P(\tau \ge n)$, as 
follows. 

2. Wald's relation. $ES_\tau = E\tau \cdot EX$ in the sense that if the right side 
exists so does the left one and then both are equal. 

Note that the right side exists when $E\tau < \infty$ and $EX$ exists, or when 
$E\tau = \infty$ and $EX$ is finite or $EX \ge 0$ or $EX \le 0$. 

Proof. Let $0 \le EX \le \infty$ and $E\tau \le \infty$. Then 

$$ES_\tau = \sum_{n=1}^{\infty} E(S_n[\tau = n]) = \sum_{n=1}^{\infty}\sum_{k=1}^{n} E(X_k[\tau = n])$$

$$= \sum_{k=1}^{\infty}\sum_{n=k}^{\infty} E(X_k[\tau = n]) = \sum_{k=1}^{\infty} E(X_k[\tau \ge k])$$

$$= \sum_{k=1}^{\infty} EX_k\, P[\tau \ge k] = EX \cdot E\tau.$$

The last but one equality is due to the fact that $[\tau < k]$ belongs to $\mathcal{B}_{k-1}$ 
hence so does its complement $[\tau \ge k]$, while $X_k$ is $\mathcal{C}_{k-1}$ $(= \mathcal{B}(X_k, 
X_{k+1}, \ldots))$-measurable, and $\mathcal{B}_{k-1}$ and $\mathcal{C}_{k-1}$ are independent. 

Changing $X$ into $-X$ the same relation holds. The other cases follow 
from $EX = EX^+ - EX^-$ with $EX^+$ or $EX^-$ finite. 
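
Wald's relation can be checked by simulation. The following is a minimal Monte Carlo sketch under assumptions of my own choosing: steps equal to +1 with probability 0.7 and -1 otherwise, and $\tau$ the first time the walk reaches level 5 (so that $S_\tau = 5$ exactly). The function names and the seed are hypothetical.

```python
import random

def first_passage(level, p_up, rng):
    """Run a +1/-1 walk until S_n first reaches `level`; return (tau, S_tau)."""
    s = 0
    n = 0
    while s < level:
        s += 1 if rng.random() < p_up else -1
        n += 1
    return n, s

def wald_check(level=5, p_up=0.7, trials=20_000, seed=1):
    """Return Monte Carlo estimates of (E S_tau, E tau * EX)."""
    rng = random.Random(seed)
    total_tau = 0
    total_s = 0
    for _ in range(trials):
        tau, s_tau = first_passage(level, p_up, rng)
        total_tau += tau
        total_s += s_tau
    ex = 2 * p_up - 1  # EX for a +1/-1 step
    return total_s / trials, (total_tau / trials) * ex

lhs, rhs = wald_check()
print(f"E S_tau ~ {lhs:.3f},  E tau * EX ~ {rhs:.3f}")
```

Both estimates should be close to 5; the left one is exact here because the walk moves by unit steps and therefore hits the level without overshoot.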

We shall frequently encounter the hitting or first visit time $\tau_A$ of a 
linear Borel set $A$ by a random walk $(S_1, S_2, \ldots)$: 
$\tau_A(\omega) = \min\{n: S_n(\omega) \in A\}$ for $\omega \in \bigcup [S_n \in A]$ and $\tau_A(\omega) = \infty$ 
otherwise. Clearly $\tau_A$ is a random walk time, since for every $n$, 

$$[\tau_A = n] = [S_k \in A^c \text{ for } k < n,\ S_n \in A] \in \mathcal{B}(S_1, \ldots, S_n).$$

Similarly for other random times we shall encounter: In general, the 
fact that they are random walk times will be clear from their definitions. 

Andersen equivalence. 

"Finite exchangeability" alone suffices for a basic Andersen result for 
"finite fluctuations." We set $\mathbf{X}_n = (X_1, \ldots, X_n)$ and say that the random 
vector $\mathbf{X}_n$ is exchangeable or that its components $X_1, \ldots, X_n$ are 
exchangeable if the distribution of $\mathbf{X}_n$ is invariant under the $n!$ permutations 
of its components. We say that a measurable function $g(\mathbf{X}_n) = 
g(X_1, \ldots, X_n)$ is exchangeable if it is invariant under the $n!$ permutations 
of its arguments. 

e. Andersen equivalence lemma. Let $X_1, \ldots, X_n$ be exchangeable 
and let $S_0, \ldots, S_n$ be their partial sums $S_0 = 0$, $S_1 = X_1$, $\ldots$, 
$S_n = X_1 + \cdots + X_n$. 

If $\nu_n$ is the (random) number of positive terms in $(S_0, \ldots, S_n)$ and $\tau_n$ is 
the (random) time of occurrence of the first maximum of its terms, then $\nu_n$ 
and $\tau_n$ are identically distributed. 

This result is an immediate consequence of a combinatorial lemma 
due to Feller whose elementary proof, modified by Joseph — as reported 
in Feller, follows. 

f. Combinatorial lemma. To each permutation $(x_{k_1}, \ldots, x_{k_n})$ of 
$(x_1, \ldots, x_n)$ associate the sequence $0, x_{k_1}, \ldots, x_{k_1} + \cdots + x_{k_n}$ of its partial 
sums. Let $m = 0, 1, \ldots, n$. 

The number $N_m$ of permutations with exactly $m$ positive sums is the same 
as the number $T_m$ of permutations in which the first maximum of partial 
sums occurs at time $m$. 

Proof. Let $N_{mk}$ and $T_{mk}$ correspond to $N_m$ and $T_m$ when $x_k$ is omitted 
in $(x_1, \ldots, x_n)$, and let $s_n = x_1 + \cdots + x_n$. We use induction: The assertion 
holds for $n = 1$ since, clearly, $x_1 \le 0$ implies $N_0 = T_0 = 1$ and $N_1 = T_1 = 0$ 
while $x_1 > 0$ implies $N_0 = T_0 = 0$ and $N_1 = T_1 = 1$. Suppose it holds for 
$n - 1 \ge 1$, that is, $N_{mk} = T_{mk}$ for $k = 1, \ldots, n$ and $m = 0, \ldots, n - 1$; since 
trivially $N_{nk} = T_{nk} = 0$, it also holds for $m = n$. 

We use the fact that by fixing $x_k$ and permuting the $n - 1$ remaining 
$x$'s then varying $k = 1, \ldots, n$, we obtain the $n!$ permutations of 
$(x_1, \ldots, x_n)$. 

If $s_n \le 0$ then $N_m$ and $T_m$ depend only on $x_{k_1}, \ldots, x_{k_{n-1}}$ hence, by 
induction hypothesis, 

$$N_m = \sum_{k=1}^{n} N_{mk} = \sum_{k=1}^{n} T_{mk} = T_m.$$

If $s_n > 0$ then $N_m = \sum_{k=1}^{n} N_{m-1,k}$. As for $T_m$, consider all $(x_k, x_{k_1}, \ldots, x_{k_{n-1}})$ 
starting with $x_k$. Since $x_k + x_{k_1} + \cdots + x_{k_{n-1}} > 0$ the maximal terms of 
their partial sums cannot be $s_0$. Since the first maximum occurs for 
$m$ $(= 1, \ldots, n)$ if and only if the first maximum of partial sums of 
$(x_{k_1}, \ldots, x_{k_{n-1}})$ occurs for $m - 1$, we have 

$$N_m = \sum_{k=1}^{n} N_{m-1,k} = \sum_{k=1}^{n} T_{m-1,k} = T_m.$$
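
The combinatorial lemma lends itself to a brute-force check on a small sequence. The sketch below enumerates all $n!$ permutations and tallies both counts; the function name and the sample sequence are hypothetical choices made for illustration, and permutations are taken over indices so that repeated values still give $n!$ cases.

```python
from itertools import permutations

def count_by_positive_sums_and_first_max(xs):
    """Return (N, T) for the sequence xs.

    N[m] = number of permutations with exactly m positive partial sums,
    T[m] = number of permutations whose first maximum of the partial sums
           (0, s_1, ..., s_n) occurs at index m.
    """
    n = len(xs)
    N = [0] * (n + 1)
    T = [0] * (n + 1)
    for perm in permutations(range(n)):
        sums = [0]
        for i in perm:
            sums.append(sums[-1] + xs[i])
        positives = sum(1 for s in sums[1:] if s > 0)
        first_max = sums.index(max(sums))  # index() returns the first occurrence
        N[positives] += 1
        T[first_max] += 1
    return N, T

N, T = count_by_positive_sums_and_first_max([1, -2, 3, -1, 0])
print(N)
print(T)
```

The two printed lists should coincide entry by entry, including for sequences with ties or zero sums.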


By using an argument formulated by Spitzer, instead of proving e we 
can prove the more general 

D. Equivalence theorem. Let $g(\mathbf{X}_n)$ be an integrable function of an 
exchangeable random vector $\mathbf{X}_n = (X_1, \ldots, X_n)$. 

If $g(\mathbf{X}_n)$ is exchangeable then, for $k = 0, 1, \ldots, n$, 

$$E(g(\mathbf{X}_n) I_{[\nu_n = k]}) = E(g(\mathbf{X}_n) I_{[\tau_n = k]});$$

in particular, 

$$E(e^{iuS_n} I_{[\nu_n = k]}) = E(e^{iuS_n} I_{[\tau_n = k]}), \quad u \in R,$$

and 

$$P[\nu_n = k] = P[\tau_n = k].$$

Proof. Let $F_n$ be the d.f. of $\mathbf{X}_n$ and $\mathbf{x}_n = (x_1, \ldots, x_n)$. 

Denote by $\sum$ summations over the $n!$ permutations $\omega_n$ of $(1, \ldots, n)$. 

Since $g(\mathbf{X}_n)$ is exchangeable 

$$E(g(\mathbf{X}_n) I_{[\nu_n = k]}) = \frac{1}{n!}\int g(\mathbf{x}_n) \sum I_{[\nu_n = k]}(\omega_n \mathbf{x}_n)\, dF_n(\mathbf{x}_n)$$

and, by the combinatorial lemma, 

$$\sum I_{[\nu_n = k]}(\omega_n \mathbf{x}_n) = \sum I_{[\tau_n = k]}(\omega_n \mathbf{x}_n).$$

Thus the first sum equals the same sum but with $\tau_n$ in lieu of $\nu_n$ hence the 
expectation equals the one with $\tau_n$ in lieu of $\nu_n$. The particular case with 
$g(\mathbf{X}_n) = e^{iuS_n}$ follows and then, setting $u = 0$, the last assertion, which 
is that of e, results. 

By means of his equivalence, Andersen obtained his first limit theorem 
for finite fluctuations, namely 

Arcsine law. Let $S_0\ (= 0), S_1, \ldots$ be partial sums of iid summands 
$X_1, X_2, \ldots$ with common law $\mathcal{L}(X)$. 

If $\mathcal{L}(X)$ is symmetric with $P(X = 0) = 0$ then 

$$P(\nu_n/n < x) \to \frac{2}{\pi}\operatorname{Arcsin}\sqrt{x}, \quad 0 \le x \le 1.$$

The Arcsine law was discovered by P. Lévy in his study of Brownian 
motion, then obtained by Erdös and Kac as a limit theorem for sums of 
independent random variables with finite second moments and obeying 
Lindeberg condition (see also Chapter XII). Andersen's result which 
does not require second finite moments was unexpected and drew attention 
to his approach. 

The proof is based upon the following considerations. The event 
$[\tau_n = k]$ consists in the occurrence of events $[S_k > S_0, \ldots, S_k > S_{k-1}]$ 
and $[S_{k+1} - S_k \le 0, \ldots, S_n - S_k \le 0]$. The first one belongs to $\mathcal{B}(X_1, 
\ldots, X_k)$ and the second one belongs to $\mathcal{B}(X_{k+1}, \ldots, X_n)$ and these two 
$\sigma$-fields are independent. Furthermore, $(X_{k+1}, \ldots, X_n)$ is distributed as 
$(X_1, \ldots, X_{n-k})$. It follows that 

$$P(\tau_n = k) = P(\tau_k = k)P(\tau_{n-k} = 0)$$

and, by Andersen equivalence, 

$$(1) \qquad P(\nu_n = k) = P(\nu_k = k)P(\nu_{n-k} = 0).$$

Let 

$$p_n(k) = \frac{1}{2^{2n}}\,\frac{(2k)!}{k!\,k!}\,\frac{(2(n-k))!}{(n-k)!\,(n-k)!}, \quad k = 0, \ldots, n,$$

so that 

$$p_n(k) = p_n(n - k), \qquad \sum_{k=0}^{n} p_n(k) = 1.$$

We prove by induction that 

$$(2) \qquad P(\nu_n = k) = p_n(k).$$

For $n = 1$, we have 

$$P(\nu_1 = 0) = P(\nu_1 = 1) = \tfrac{1}{2} = p_1(0) = p_1(1).$$

If (2) holds for $n - 1$ hence, by (1), $P(\nu_n = k) = p_n(k)$ for $k = 1, \ldots, 
n - 1$, then 

$$P(\nu_n = 0) + P(\nu_n = n) = 1 - \sum_{k=1}^{n-1} P(\nu_n = k) = 1 - \sum_{k=1}^{n-1} p_n(k) = p_n(0) + p_n(n).$$

Since the hypothesis about $\mathcal{L}(X)$ implies easily that $P(\nu_n = 0) = 
P(\nu_n = n)$, it follows that 

$$P(\nu_n = 0) = p_n(0) = P(\nu_n = n) = p_n(n).$$

Once (2) is proved, the Arcsine law follows by elementary computations 
using Stirling's formula (see Introductory Part, 7). 
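
The passage from the explicit probabilities $p_n(k)$ to the Arcsine law can be watched numerically. The sketch below, with hypothetical helper names `p` and `cdf`, computes $p_n(k)$ exactly with rational arithmetic, confirms the total mass and the symmetry, and compares partial sums with $(2/\pi)\operatorname{Arcsin}\sqrt{x}$ for one assumed value of $n$.

```python
from fractions import Fraction
from math import asin, comb, pi, sqrt

def p(n, k):
    """p_n(k) = C(2k, k) * C(2(n-k), n-k) / 2^(2n), as an exact fraction."""
    return Fraction(comb(2 * k, k) * comb(2 * (n - k), n - k), 4 ** n)

def cdf(n, x):
    """Sum of p_n(k) over all k with k/n < x, returned as a float."""
    return float(sum(p(n, k) for k in range(n + 1) if k < n * x))

n = 400
print(sum(p(n, k) for k in range(n + 1)))  # total mass, computed exactly
for x in (0.1, 0.25, 0.5, 0.9):
    print(x, round(cdf(n, x), 4), round(2 / pi * asin(sqrt(x)), 4))
```

The two columns printed for each $x$ are the finite-$n$ value and the limit; they agree up to a discretization error that shrinks as $n$ grows.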

We shall use the foregoing basic implications without further comment. 

26.2. Dichotomy: recurrence and transience. We recall that $x \in R$ 
is a possible state of a random walk $(S_1, S_2, \ldots)$ if for every neighborhood 
$V_x$ there is an $n = n(V_x)$ such that $P(S_n \in V_x) > 0$. We say that $x$ is 
a recurrent state of the random walk, if, for every $V_x$, $P(S_n \in V_x \text{ i.o.}) = 1$; 
as usual "i.o." stands for "infinitely often," that is, for infinitely many $n$, 
and "f.o." for "finitely often" will stand for denial of "i.o.", that is, for 
"at most finitely many $n$." Thus, to say that $x$ is recurrent is equivalent 
to $P(S_n \in V_x \text{ f.o.}) = 0$. Clearly, a recurrent state is possible and it suffices 
to consider neighborhoods $V_x$ of the form $(x - \epsilon, x + \epsilon)$. 

a. If a random walk has a recurrent state $x$ then all possible states are 
recurrent. 

Proof. If $y$ is a possible state, that is, for every $\epsilon > 0$ there is a 
$k = k(\epsilon)$ such that $P(|S_k - y| < \epsilon) > 0$ then $x - y$ is recurrent: For 
then, 

$$0 = P(|S_n - x| < 2\epsilon \text{ f.o.})$$

$$\ge P(|S_k - y| < \epsilon,\ |S_{n+k} - S_k - (x - y)| < \epsilon \text{ f.o.})$$

$$= P(|S_k - y| < \epsilon)\, P(|S_n - (x - y)| < \epsilon \text{ f.o.})$$

hence $P(|S_n - (x - y)| < \epsilon \text{ f.o.}) = 0$ and $x - y$ is recurrent. It follows 
that every possible state $y = x - (x - y)$ is recurrent and so is 
$x - x = 0$. 

Thus we are led to a dichotomy: A random walk is recurrent if one 
hence all its possible states are recurrent, or it is transient if none of its 
possible states is recurrent. 

As usual, $\mathcal{L}(X)$ denotes the common law of the iid random steps 
$X_1, X_2, \ldots$ which generate the random walk. 

A. Recurrence theorem. Let $X$ be $L_d$-distributed with $d \ge 0$. 

The random walk is recurrent if and only if one of its possible states is 
recurrent, and then $L_d$ is the set of its states. 

Proof. If the set $\mathcal{R}$ of recurrent states is not empty then, by a, the 
random walk is recurrent while the converse is trivially true. 
Let the random walk be recurrent. Always by a, $\mathcal{R}$ is closed under 
differences and $0 \in \mathcal{R}$. It follows that $x \in \mathcal{R} \Leftrightarrow -x = 0 - x \in \mathcal{R}$, and $\mathcal{R}$ is 
an additive group. Furthermore, $\mathcal{R}$ is topologically closed since, for any 
given $V_x$, if recurrent $x_n \to x$ then, from some $n$ on, $x_n \in V_x$ hence 
$P(S_n \in V_x \text{ i.o.}) = 1$, and $x$ is recurrent. Since the trivial case of random 
walks degenerate at 0 is excluded, $\mathcal{R} \ne \{0\}$ and the only foregoing 
subgroups in $R$ are of the form $\mathcal{R} = L_{d'}$ with $d' \ge 0$. If $d' = 0$ then $d = 0$. 
If $d' > 0$ then $L_{d'} \subset L_d$ hence $d \le d'$. Suppose $d < d'$ so that there is a 
possible state which is not recurrent. This contradicts the hypothesis 
that the random walk is recurrent. Thus $d = d'$, and the proof is 
terminated. 

Corollary. Let $X$ be $L_d$-distributed with $d \ge 0$. 
Either $P(S_n \in V \text{ i.o.}) = 1$ for all bounded open sets $V$ intersecting $L_d$, or 
$P(S_n \in V \text{ i.o.}) = 0$ for all such $V$. 

B. Dichotomy criterion. Let $X$ be $L_d$-distributed with $d \ge 0$. 

(i) If $\sum_{n=1}^{\infty} P(S_n \in J) = \infty$ for some bounded open interval $J$, necessarily 
intersecting $L_d$, then the random walk is recurrent. 

(ii) If $\sum_{n=1}^{\infty} P(S_n \in J) < \infty$ for some bounded open interval $J$ 
intersecting $L_d$, then the random walk is transient. 

Proof. By Borel-Cantelli lemma, the hypothesis in (ii) implies 
$P(S_n \in J \text{ i.o.}) = 0$ for some bounded open interval $J$ intersecting $L_d$ so 
that there is a possible state which is not recurrent hence, by A, no state 
is recurrent and the random walk is transient. 

Let $\sum_{n=1}^{\infty} P(S_n \in J) = \infty$ for some bounded open interval $J$ with length 
$|J|$. Then, for every $\epsilon < |J|/2$ there is a $J_x = (x - \epsilon, x + \epsilon) \subset J$ 
such that $\sum_{n=1}^{\infty} P(S_n \in J_x) = \infty$. Consider the time $\tau$ of the last visit 
by the random walk to $J_x$ if any, and set $\tau = 0$ if none and $\tau = \infty$ if 
infinitely many. Thus, for $k = 1, 2, \ldots$ 

$$A_n = [\tau = n] = [S_n \in J_x,\ S_{n+k} \notin J_x \text{ for all } k], \quad n \ge 1,$$

and 

$$A_0 = [\tau = 0] = [S_n \notin J_x \text{ for all } n],$$

hence 

$$P(\tau < \infty) = P(S_n \in J_x \text{ f.o.}) = \sum_{n=0}^{\infty} PA_n.$$

Since, for $n \ge 1$, 

$$PA_n \ge P(S_n \in J_x,\ |S_{n+k} - S_n| \ge 2\epsilon \text{ for all } k)$$

$$= P(S_n \in J_x)\, P(|S_{n+k} - S_n| \ge 2\epsilon \text{ for all } k),$$

it follows that 

$$1 \ge P(S_n \in J_x \text{ f.o.}) \ge P(|S_k| \ge 2\epsilon \text{ for all } k) \sum_{n=1}^{\infty} P(S_n \in J_x).$$

Thus, $\sum_{n=1}^{\infty} P(S_n \in J_x) = \infty$ implies that for every $\epsilon > 0$ 

$$(1) \qquad P(|S_k| \ge 2\epsilon \text{ for all } k) = 0.$$

This relation implies recurrence of 0 hence of the random walk, as 
follows. 

Take $J_0 = (-\epsilon, +\epsilon)$, let $J_\delta = (-\delta, \delta)$ with $0 < \delta < \epsilon$, and define the 
corresponding $A_n^0$ as the $A_n$ were defined but replacing $x$ by 0. Note 
that, by (1), $PA_0^0 = P(|S_k| \ge \epsilon \text{ for all } k) = 0$. In fact, all $PA_n^0 = 0$ 
for $n \ge 1$: For, as $\delta \uparrow \epsilon$, 

$$A_n^{0,\delta} = [S_n \in J_\delta,\ S_{n+k} \notin J_0 \text{ for all } k] \uparrow A_n^0$$

hence $PA_n^{0,\delta} \to PA_n^0$ and, by (1), $PA_n^{0,\delta} = 0$ since 

$$P(S_n \in J_\delta,\ S_{n+k} \notin J_0 \text{ for all } k)$$

$$\le P(S_n \in J_\delta,\ |S_{n+k} - S_n| \ge \epsilon - \delta \text{ for all } k)$$

$$= P(S_n \in J_\delta)\, P(|S_k| \ge \epsilon - \delta \text{ for all } k) = 0.$$

Thus, 

$$P(S_n \in J_0 \text{ f.o.}) = \sum_{n=0}^{\infty} PA_n^0 = 0$$

so that 0 is recurrent, and the proof is completed. 
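
The dichotomy criterion can be seen at work on the simplest lattice walk. The sketch below assumes steps of +1 with probability `p_up` and -1 otherwise (an example not taken from the text) and $J = (-1, 1)$, so that $S_n \in J$ means $S_n = 0$, which is only possible for even $n = 2m$, with $P(S_{2m} = 0) = \binom{2m}{m}(pq)^m$. The function name is hypothetical.

```python
def return_series(p_up, n_terms):
    """Partial sum over m = 1..n_terms of P(S_{2m} = 0) = C(2m, m) (p q)^m."""
    q = 1.0 - p_up
    term = 1.0  # value for m = 0, namely C(0, 0) (p q)^0
    total = 0.0
    for m in range(1, n_terms + 1):
        # C(2m, m) / C(2m - 2, m - 1) = 2 (2m - 1) / m
        term *= 2.0 * (2 * m - 1) / m * p_up * q
        total += term
    return total

for n_terms in (10, 100, 1000, 10000):
    print(n_terms, round(return_series(0.5, n_terms), 3), round(return_series(0.7, n_terms), 6))
```

For the symmetric walk the partial sums keep growing (recurrence by (i)), while for `p_up = 0.7` they settle near $1/|p - q| - 1 = 1.5$ (transience by (ii)).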

Corollary. If for some bounded open interval $J$ intersecting $L_d$, 
$\sum_{n=1}^{\infty} P(S_n \in J)$ is either infinite or finite, then the same holds, respectively, 
for all such $J$. 

The elementary proofs of A and B are the original ones and are due to 
Feller, while the proof of C is due to Chung and Ornstein and that of D 
is due to Chung and Fuchs as modified by Feller. 
The next proposition provides us with a dichotomy criterion in terms of 
one numerical characteristic of $\mathcal{L}(X)$, namely in terms of $EX$ provided it 
exists. 

C. Expectation criterion. Let $EX$ exist. Then the random walk is 
recurrent if and only if $EX = 0$. More precisely 

(i) If $EX = 0$ then the random walk is recurrent and a.s. 

$$-\infty = \liminf S_n < \limsup S_n = +\infty.$$

(ii) If $EX > 0$ or $EX < 0$ then the random walk is transient and 

$$S_n \to +\infty \text{ a.s. or } S_n \to -\infty \text{ a.s., respectively.}$$

To prove this proposition we need the lemma below; we introduce 
$S_0 = 0$ and write $I(A)$ in lieu of $I_A$ for any event $A$. 

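b. For every $\epsilon > 0$ and every integer $m$ 

$$\frac{1}{2m}\sum_{n=0}^{\infty} P(|S_n| < m\epsilon) \le \sum_{n=0}^{\infty} P(|S_n| < \epsilon).$$

Proof. Let the right side be finite; otherwise there is nothing to prove. 

Let $J$ be an interval of length $\epsilon$ and let $\nu = \sum_{n=1}^{\infty} I(S_n \in J)$ be the number 
of visits to $J$ by the random walk $(S_1, S_2, \ldots)$ so that their expected 
number is $E\nu = \sum_{n=1}^{\infty} P(S_n \in J)$. Set $\tau = \min\{n \ge 1: S_n \in J\}$ when 
this set is not empty and $\tau = \infty$ when it is; $\tau$ is the time of the first visit 
to $J$ and $E\nu = \sum_{n=1}^{\infty} E(\nu I(\tau = n))$. On $[\tau = n]$, $I(S_k \in J) = 0$ for $k < n$ 
while $I(S_n \in J) = 1$ hence 

$$\nu I[\tau = n] = \sum_{k=n}^{\infty} I(S_k \in J) = 1 + \sum_{k=n+1}^{\infty} I(S_k \in J)$$

$$= 1 + \sum_{k=n+1}^{\infty} I((S_k - S_n) + S_n \in J)$$

$$\le 1 + \sum_{k=n+1}^{\infty} I(|S_k - S_n| < \epsilon),$$

where the last sum is distributed as $\sum_{k=1}^{\infty} I(|S_k| < \epsilon)$, so that the right side 
is distributed as $\sum_{k=0}^{\infty} I(|S_k| < \epsilon)$. 
Since $[\tau = n]$ and $S_k - S_n$ are independent when $k > n$, it follows that 

$$E\nu \le \sum_{n=1}^{\infty}\Big(P(\tau = n)\sum_{k=0}^{\infty} P(|S_k| < \epsilon)\Big) \le \sum_{n=0}^{\infty} P(|S_n| < \epsilon).$$

Therefore, 

$$\sum_{n=0}^{\infty} P(S_n \in J) \le \sum_{n=0}^{\infty} P(|S_n| < \epsilon)$$

since $P(S_0 \in J) = 0$ unless $0 \in J$ when this inequality holds trivially, 
term by term. Upon replacing $J$ by $J_j = [j\epsilon, (j + 1)\epsilon)$ and summing 
over $j = -m, -m + 1, \ldots, m - 1$, the asserted inequality 

$$\frac{1}{2m}\sum_{n=0}^{\infty} P(|S_n| < m\epsilon) \le \sum_{n=0}^{\infty} P(|S_n| < \epsilon)$$

obtains. 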

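Proof of C. By the SLLN, if $EX > 0$ then $S_n/n \to EX > 0$ a.s. hence 
$S_n \to +\infty$ a.s. and the random walk cannot be recurrent; similarly for 
$EX < 0$. 

Let $EX = 0$ so that $S_n/n \to 0$ a.s. and, a fortiori, $S_n/n \to 0$ in pr. hence, for 
given $\epsilon > 0$ and $n \ge n_\epsilon$ sufficiently large, $P(|S_n| < n\epsilon) > 1/2$. Therefore, 
for $m/\epsilon \ge n_\epsilon$, 

$$\frac{1}{2m}\sum_{n=0}^{\infty} P(|S_n| < m) \ge \frac{1}{2m}\sum_{n_\epsilon \le n \le m/\epsilon} P(|S_n| < n\epsilon) \ge \frac{1}{4m}\Big(\frac{m}{\epsilon} - n_\epsilon\Big) = \frac{1}{4\epsilon} - \frac{n_\epsilon}{4m}$$

so that, by b with $\epsilon = 1$, 

$$\sum_{n=0}^{\infty} P(|S_n| < 1) \ge \limsup_{m \to \infty}\Big(\frac{1}{4\epsilon} - \frac{n_\epsilon}{4m}\Big) = \frac{1}{4\epsilon} \to \infty$$

as $\epsilon \to 0$, B applies and the random walk is recurrent. But then the 
(nondegenerate at 0) random walk cannot drift to $+\infty$ or to $-\infty$ and 
the only asymptotic alternative is a.s. $-\infty = \liminf S_n < \limsup S_n = 
+\infty$. The proof is terminated. 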

If a random walk obeys the infinite oscillations alternative it is not 
necessarily recurrent: Symmetric random walks, that is with $\mathcal{L}(X)$ 
symmetric, obey this alternative and we produce now such random walks 
which are transient. 
Let $\mathcal{L}(X)$ be nondegenerate symmetric stable, that is, with $f(u) = 
\exp(-c|u|^\gamma)$, $c > 0$, $0 < \gamma \le 2$. According to end of 25.2, $F'$ exists and is 
continuous and $0 \le F'(x) \le F'(0)$ with $F'(0) > 0$. Furthermore, $\mathcal{L}(X)$ 
being stable, 

$$\mathcal{L}(S_n/n^{1/\gamma}) = \mathcal{L}(X) \quad \text{for every } n = 1, 2, \ldots.$$

It follows that 

$$P(|S_n| < 1) = P(|X| < 1/n^{1/\gamma}) = \int_{-n^{-1/\gamma}}^{n^{-1/\gamma}} F'(x)\,dx \sim 2F'(0)\,n^{-1/\gamma}$$

so that $\sum_{n=1}^{\infty} P(|S_n| < 1)$ is finite or infinite, according as $\sum_{n=1}^{\infty} n^{-1/\gamma}$ is 
finite or infinite hence, according as $0 < \gamma < 1$ or $1 \le \gamma \le 2$. Thus, by 
B, our symmetric random walk is transient for $0 < \gamma < 1$ and recurrent 
for $1 \le \gamma \le 2$; note that $EX$ does not exist for $0 < \gamma \le 1$. 

Finally, we search for conditions for recurrence or transience in terms 
of the ch.f. $f$ of $\mathcal{L}(X)$. (So far, they seem to provide the only approach 
for general random walks in euclidean spaces $R^n$, $n > 1$.) In what 
follows we use the immediate 

$$\text{Parseval relation:} \quad \int f_X(u)\,dF_Y(u) = \int f_Y(x)\,dF_X(x)$$

which obtains upon integrating $f_X(u) = \int e^{iux}\,dF_X(x)$ with respect to $F_Y(u)$, 
and two laws with 

triangular pr. density: 

$$F'(x) = \frac{1}{h}\Big(1 - \frac{|x|}{h}\Big) \vee 0, \qquad f(u) = \frac{2(1 - \cos hu)}{h^2u^2}, \quad h > 0,$$

triangular ch.f.: 

$$f(u) = \Big(1 - \frac{|u|}{h}\Big) \vee 0, \qquad F'(x) = \frac{1}{\pi}\,\frac{1 - \cos hx}{hx^2}.$$

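D. Ch.f.'s and dichotomy. Let $f$ be the ch.f. of the common law $\mathcal{L}(X)$. 

(i) The random walk is recurrent if there is a $\delta > 0$ with 

$$\limsup_{t \uparrow 1}\int_{-\delta}^{\delta}\operatorname{Re}\frac{du}{1 - tf(u)} = \infty.$$

(ii) The random walk is transient if there is a $\delta > 0$ with 

$$\limsup_{t \uparrow 1}\int_{-\delta}^{\delta}\operatorname{Re}\frac{du}{1 - tf(u)} < \infty.$$

Proof. Let $S_0 = 0$. 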

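1°. Parseval relation with triangular ch.f. and $F_{S_n}$ yields 

$$P(|S_n| < h) \ge \int\Big(\Big(1 - \frac{|x|}{h}\Big) \vee 0\Big)\,dF_{S_n}(x) = \frac{1}{\pi}\int\frac{1 - \cos hu}{hu^2}\,f^n(u)\,du.$$

Since $(1 - \cos hu)/hu^2 \ge ch$ for $|u| < 1/h$ and some $c > 0$ and 

$$\operatorname{Re}\frac{1}{1 - tf(u)} = \frac{1 - t\operatorname{Re}f(u)}{|1 - tf(u)|^2} \ge 0,$$

it follows that 

$$\sum_{n=0}^{\infty} t^n P(|S_n| < h) \ge \frac{1}{\pi}\int\frac{1 - \cos hu}{hu^2}\operatorname{Re}\frac{du}{1 - tf(u)} \ge \frac{ch}{\pi}\int_{-1/h}^{1/h}\operatorname{Re}\frac{du}{1 - tf(u)}.$$

Therefore, by hypothesis in (i), for $1/h > \delta$, 

$$\sum_{n=0}^{\infty} P(|S_n| < h) \ge \frac{ch}{\pi}\limsup_{t \uparrow 1}\int_{-\delta}^{\delta}\operatorname{Re}\frac{du}{1 - tf(u)} = \infty$$

and recurrence obtains by B. 

2°. Parseval relation with triangular pr. density yields 

$$\int\frac{2(1 - \cos hx)}{h^2x^2}\,dF_{S_n}(x) = \frac{1}{h}\int_{-h}^{h}\Big(1 - \frac{|u|}{h}\Big)f^n(u)\,du$$

so that for $|x| < 2/h$ hence $(1 - \cos hx)/h^2x^2 > 1/3$ 

$$\sum_{n=0}^{\infty} t^n P(|S_n| < 2/h) \le \frac{3}{2h}\int_{-h}^{h}\Big(1 - \frac{|u|}{h}\Big)\operatorname{Re}\frac{du}{1 - tf(u)} \le \frac{3}{2h}\int_{-h}^{h}\operatorname{Re}\frac{du}{1 - tf(u)}.$$

Therefore, by hypothesis in (ii), for $h < \delta$, 

$$\sum_{n=0}^{\infty} P(|S_n| < 2/h) \le \frac{3}{2h}\limsup_{t \uparrow 1}\int_{-\delta}^{\delta}\operatorname{Re}\frac{du}{1 - tf(u)} < \infty,$$

and transience obtains by B. 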

Corollary 1. If, for some $\delta > 0$, 

$$\int_{-\delta}^{\delta}\operatorname{Re}\frac{du}{1 - f(u)} = \infty$$

then the random walk is recurrent. 

Corollary 2. If $EX = 0$ then the random walk is recurrent. 

Follows by elementary computations from the fact that, given $\epsilon > 0$, 
$EX = 0$ implies $0 \le 1 - \operatorname{Re}f(u) < \epsilon|u|$ for $|u| < \delta$ sufficiently small. 

26.3. Fluctuations; exponential identities. We consider random 
variables defined on $(S_0, \ldots, S_n)$, say, the number of its positive terms 
or their maximum or times of occurrence of this maximum, etc. We 
shall find the explicit form of their laws in terms of "exponential 
identities." The method will be Fourier analytic. At its core lies a 
"Wiener-Hopf" factorization technique for the generating characteristic $1/(1 - tf)$ 
of the random walk $(S_0, S_1, \ldots)$. 

In what follows, $0 < t < 1$, $u \in R$, $A$ denotes a linear Borel set, and 
we set 

$$f_A(u, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}\int_{[S_n \in A]} e^{iuS_n}\Big\}$$

hence 

$$f_{A^c}(u, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}\int_{[S_n \in A^c]} e^{iuS_n}\Big\}.$$

a. Factorization lemma. 

$$\frac{1}{1 - tf(u)} = f_A(u, t)\,f_{A^c}(u, t).$$

Results from 

$$\frac{1}{1 - tf(u)} = \exp\Big\{\log\frac{1}{1 - tf(u)}\Big\} = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}f^n(u)\Big\}, \qquad f^n(u) = \int_{[S_n \in A]} e^{iuS_n} + \int_{[S_n \in A^c]} e^{iuS_n}.$$

We shall be dealing with Fourier-Stieltjes transforms of functions of 
bounded variation on linear Borel sets, of the form 

$$p(u) = \int e^{iux}\,dG(x)$$

with same affixes for $p$ and $G$, if any. Exactly as for characteristic 
functions, the uniqueness theorem $p \leftrightarrow G$ (up to additive constants) is valid, 
their products $pp'$ correspond to compositions $G * G'$ and, clearly, their 
sums and differences $p \pm p'$ are transforms of functions of bounded 
variation $G \pm G'$. 


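Set 

$$P_A(u, t) = \sum_{n=0}^{\infty} p_n(u)t^n, \qquad Q_{A^c}(u, t) = \sum_{n=0}^{\infty} q_n(u)t^n$$

where $p_0(u) = q_0(u) = 1$ and, for $n \ge 1$, 

$$p_n(u) = \int_A e^{iux}\,dG_n(x), \qquad q_n(u) = \int_{A^c} e^{iux}\,dG_n(x).$$

A. Unique factorization theorem. If 

$$\text{(i)} \quad \frac{1}{1 - tf(u)} = P_A(u, t)\,Q_{A^c}(u, t) \qquad \text{or} \qquad \text{(ii)} \quad \frac{P_A(u, t)}{1 - tf(u)} = Q_{A^c}(u, t)$$

then, respectively, 

$$P_A(u, t) = f_A(u, t) \quad \text{or} \quad P_A(u, t) = f_A^{-1}(u, t)$$

and 

$$Q_{A^c}(u, t) = f_{A^c}(u, t).$$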

Proof. Because of a, it suffices to show that if the foregoing relations 
hold for $P'_A(u, t)$ and $Q'_{A^c}(u, t)$ then, for $n = 0, 1, \ldots$, $p_n(u) = p'_n(u)$ 
and $q_n(u) = q'_n(u)$, $u \in R$. 

Upon identifying the coefficients of the $t^n$, (i) and (ii) then become, 
respectively, 

$$(1) \qquad \sum_{k=0}^{n} p_k(u)q_{n-k}(u) = \sum_{k=0}^{n} p'_k(u)q'_{n-k}(u)$$

$$(2) \qquad \sum_{k=0}^{n} p_k(u)q'_{n-k}(u) = \sum_{k=0}^{n} p'_k(u)q_{n-k}(u).$$

We proceed by induction: The assertion is trivially true for $n = 0$. 
If $p_k(u) = p'_k(u)$ and $q_k(u) = q'_k(u)$ for $k = 1, \ldots, n - 1$ then in (1) 
and (2) the first $n - 1$ terms in the left and right sums coincide so that 

$$p_n(u) + q_n(u) = p'_n(u) + q'_n(u) \quad \text{or} \quad p_n(u) + q'_n(u) = p'_n(u) + q_n(u).$$

Thus 

$$p_n(u) - p'_n(u) = q'_n(u) - q_n(u) \quad \text{or} \quad p_n(u) - p'_n(u) = q_n(u) - q'_n(u),$$

that is, for $u \in R$, 

$$\int_A e^{iux}\,dH_n(x) = -\int_{A^c} e^{iux}\,dH_n(x) \quad \text{or} \quad \int_A e^{iux}\,dH_n(x) = \int_{A^c} e^{iux}\,dH_n(x),$$

where the functions $H_n = G_n - G'_n$ are of bounded variation. Therefore, 
by the uniqueness property for Fourier-Stieltjes transforms, 
$I_A\,dH_n = \mp I_{A^c}\,dH_n$ so that both sides vanish. The assertion follows. 

This proof as well as B are due to Baxter. 

From now on, to simplify the writing, $\int_A e^{iuS_n} = E(e^{iuS_n}I(A))$ will 
be denoted by $E(e^{iuS_n}: A)$ and, when $A$ is of the form $[\cdots]$ we shall 
omit the square brackets. The first visit time of $A$ by $(S_1, S_2, \ldots)$ will 
be called hitting time of $A$. When $\tau$ is a random time, for $\tau = \infty$, $t^{n+\tau} = 0$, 
$(0 < t < 1)$, $n = 0, 1, \ldots$; note that if $\tau$ is a time of $(S_1, S_2, \ldots)$ then 
$[\tau = 0] = \emptyset$. 

B. Random times identities. Let $\tau$ be a time of $(S_1, S_2, \ldots)$. 

(i) The following identities hold: 

$$E(t^\tau e^{iuS_\tau}) = \sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau = n),$$

$$E\Big(\sum_{n=0}^{\tau - 1} t^n e^{iuS_n}\Big) = \sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau > n),$$

$$\frac{1 - E(t^\tau e^{iuS_\tau})}{1 - tf(u)} = E\Big(\sum_{n=0}^{\tau - 1} t^n e^{iuS_n}\Big).$$

(ii) When $\tau = \tau_A$ is hitting time of $A$ then 

$$1 - E(t^{\tau_A}\exp iuS_{\tau_A}) = f_A^{-1}(u, t) = \exp\Big\{-\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{iuS_n}: S_n \in A)\Big\},$$

$$E\Big(\sum_{n=0}^{\tau_A - 1} t^n e^{iuS_n}\Big) = f_{A^c}(u, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{iuS_n}: S_n \in A^c)\Big\}.$$

Proof. The first identity in (i) results at once from the definitions. 
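The second one results from 

$$E\Big(\sum_{n=0}^{\tau - 1} t^n e^{iuS_n}\Big) = \sum_{k=1}^{\infty} E\Big(\sum_{n=0}^{k-1} t^n e^{iuS_n}: \tau = k\Big) = \sum_{n=0}^{\infty} E(t^n e^{iuS_n}: \tau > n).$$

Since $\tau$ is a time of the random walk 

$$E\Big(\sum_{n=\tau}^{\infty} t^n e^{iuS_n}\Big) = E\Big(t^\tau e^{iuS_\tau}\sum_{n=0}^{\infty} t^n e^{iu(S_{\tau+n} - S_\tau)}\Big)$$

$$= E(t^\tau e^{iuS_\tau})\,E\Big(\sum_{n=0}^{\infty} t^n e^{iuS_n}\Big) = E(t^\tau e^{iuS_\tau})/(1 - tf(u))$$

and, replacing in 

$$1/(1 - tf(u)) = E\Big(\sum_{n=0}^{\tau - 1} t^n e^{iuS_n}\Big) + E\Big(\sum_{n=\tau}^{\infty} t^n e^{iuS_n}\Big),$$

the third identity obtains. By the unique factorization theorem, it 
implies the two identities in (ii). 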

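Our main concern is with $A = (0, \infty)$ hence $A^c = (-\infty, 0]$, and we 
set $f_+ = f_{(0,\infty)}$, $f_- = f_{(-\infty,0]}$ so that 

$$f_+(u, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{iuS_n}: S_n > 0)\Big\},$$

$$f_-(u, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{iuS_n}: S_n \le 0)\Big\}.$$

Corollary. If $\tau = \tau_{(0,\infty)}$ then 

$$\frac{1}{1 - E(t^\tau e^{iuS_\tau})} = f_+(u, t), \qquad E\Big(\sum_{n=0}^{\tau - 1} t^n e^{iuS_n}\Big) = f_-(u, t).$$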

C. Maxima times and positive sums identities. 

(i) If $\tau_n$ is the time of occurrence of the first maximum of $(S_0, S_1, \ldots, S_n)$ 
then 

$$E(e^{iuS_n}: \tau_n = k) = E(e^{iuS_k}: \tau_k = k)\,E(e^{iuS_{n-k}}: \tau_{n-k} = 0),$$

$$\sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau_n = n) = f_+(u, t),$$

$$\sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau_n = 0) = f_-(u, t),$$

$$\sum_{n=0}^{\infty} t^n E(s^{\tau_n}e^{iuS_n}) = f_+(u, st)\,f_-(u, t), \quad 0 < s \le 1.$$

(ii) If $\nu_n$ is the number of positive sums in $(S_0, S_1, \ldots, S_n)$ then the 
above identities remain valid when therein $\tau$ is replaced by $\nu$ with same 
subscripts. 

(iii) All above identities, including those in the above Corollary, remain 
valid when $(0, \infty)$ is replaced by $[0, \infty)$ provided $\tau_n$ is the time of the last 
maximum of $(S_0, S_1, \ldots, S_n)$ and $\nu_n$ is the number of nonnegative sums in 
$(S_0, \ldots, S_n)$ while $S_n > 0$ and $S_n \le 0$ are replaced by $S_n \ge 0$ and 
$S_n < 0$ in $f_+$ and in $f_-$, respectively. 

Proof. The identities in (i) are based upon a "sample space 
factorization": If $M_n = \max(S_0, \ldots, S_n)$ then the first time this maximum occurs 
is $\tau_n = \min\{0 \le k \le n: S_k = M_n\}$ and, by the very definition of $\tau_n = 
\tau_n(X_1, \ldots, X_n)$, 

$$[\tau_n(X_1, \ldots, X_n) = k] = [\tau_k(X_1, \ldots, X_k) = k][\tau_{n-k}(X_{k+1}, \ldots, X_n) = 0].$$

Since the last two events are independent and so are $S_k$ and $S_n - S_k$ while 
$S_n - S_k$ has the same distribution as $S_{n-k}$, it follows that 

$$E(e^{iuS_n}: \tau_n = k) = E(e^{iuS_k}: \tau_k = k) \cdot E(e^{iuS_{n-k}}: \tau_{n-k} = 0).$$

Thus, upon multiplying by $s^kt^n$ and summing over $0 \le k \le n < \infty$, 

$$\sum_{n=0}^{\infty} t^n E(s^{\tau_n}e^{iuS_n}) = P(u, st)\,Q(u, t)$$

where 

$$P(u, t) = \sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau_n = n),$$

$$Q(u, t) = \sum_{n=0}^{\infty} t^n E(e^{iuS_n}: \tau_n = 0).$$

For $s = 1$, the preceding relation becomes 

$$\frac{1}{1 - tf(u)} = P(u, t)\,Q(u, t)$$

while $\tau_n = n$ implies $S_n > 0$ and $\tau_n = 0$ implies $S_n \le 0$. The unique 
factorization theorem applies so that 

$$P(u, t) = f_+(u, t), \qquad Q(u, t) = f_-(u, t)$$

and the identities in (i) follow. 

By Andersen equivalence, the sample space factorization for the times 
of first maxima is equivalent to the far from obvious sample space 
factorization for the numbers of positive sums: 

$$[\nu_n(X_1, \ldots, X_n) = k] = [\nu_k(X_1, \ldots, X_k) = k][\nu_{n-k}(X_{k+1}, \ldots, X_n) = 0],$$

and (ii) for positive sums identities follows. 

Finally, by using in the unique factorization theorem $[0, \infty)$ in lieu of 
$(0, \infty)$, (iii) results from the fact that all the foregoing arguments 
continue to apply to the corresponding $\tau_n$ and $\nu_n$. 
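
The factorization for the numbers of positive sums can be checked at the level of probabilities, that is, in the form $P(\nu_n = k) = P(\nu_k = k)P(\nu_{n-k} = 0)$. The sketch below does this by exact enumeration under an assumption chosen for illustration: iid steps equal to +1 with probability `p_up` and -1 otherwise. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def law_of_positive_count(n, p_up):
    """Exact law of nu_n = #{1 <= j <= n : S_j > 0} for iid +1/-1 steps."""
    law = [Fraction(0)] * (n + 1)
    for steps in product((1, -1), repeat=n):
        prob = Fraction(1)
        s = 0
        positives = 0
        for x in steps:
            prob *= p_up if x == 1 else 1 - p_up
            s += x
            positives += s > 0  # a bool counts as 0 or 1
        law[positives] += prob
    return law

def check_extreme_factorization(n, p_up):
    """True when P(nu_n = k) = P(nu_k = k) * P(nu_{n-k} = 0) for every k."""
    laws = [law_of_positive_count(m, p_up) for m in range(n + 1)]
    return all(laws[n][k] == laws[k][k] * laws[n - k][0] for k in range(n + 1))

print(check_extreme_factorization(8, Fraction(1, 3)))
```

Rational arithmetic is used so that the comparison is exact rather than up to rounding; the walk need not be symmetric for the relation to hold.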

The following important identity, known in various guises and with 
various degrees of generality, has its origin in the basic Spitzer identity 
below (Pollaczec, Spitzer, Kemperman, Port, etc.). 

D. Maximum time and value identity. If $M_n = \max(S_0, \ldots, S_n)$ 
and $\tau_n = \min\{0 \le k \le n: S_k = M_n\}$, then 

$$\sum_{n=0}^{\infty} t^n E(s^{\tau_n}e^{iuS_n + ivM_n}) = f_+(u + v, st)\,f_-(u, t), \quad 0 < s \le 1.$$

Proof. Since $\tau_n = k \Rightarrow M_n = S_k$, by sample space factorization, 

$$E(e^{iuS_n + ivM_n}: \tau_n = k) = E(e^{i(u+v)S_k}e^{iu(S_n - S_k)}: \tau_n = k)$$

$$= E(e^{i(u+v)S_k}: \tau_k = k) \cdot E(e^{iuS_{n-k}}: \tau_{n-k} = 0).$$

Upon multiplying by $s^kt^n$ and summing for $0 \le k \le n < \infty$, it follows 
that 

$$\sum_{n=0}^{\infty} t^n E(s^{\tau_n}e^{iuS_n + ivM_n}) = P(u + v, st)\,Q(u, t),$$

where $P$ and $Q$ are the functions introduced in the preceding proof and, 
as therein, the unique factorization theorem yields the asserted identity. 

Particular cases. 1°. For $v = 0$ we obtain the last identity in C(i). 

2°. For $s = 1$ and $u = 0$, changing $v$ into $u$, we obtain the 
Pollaczec-Spitzer identity: 

$$\sum_{n=0}^{\infty} t^n E(e^{iuM_n}) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{iuS_n^+})\Big\}.$$

It was first discovered by Pollaczec but remained unnoticed until 
rediscovered by Spitzer. 
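
The Pollaczec-Spitzer identity can be tested coefficient by coefficient. Writing $\varphi_n = E e^{iuM_n}$ and $\psi_n = E e^{iuS_n^+}$ (notation introduced here only for the check), differentiating both sides in $t$ shows that the identity is equivalent to $n\varphi_n = \sum_{k=1}^{n}\psi_k\varphi_{n-k}$ for all $n \ge 1$. The sketch below verifies this recursion by enumeration for an assumed +1/-1 walk at one assumed value of $u$; the function names are hypothetical.

```python
import cmath
from itertools import product

def max_and_positive_part_chfs(n_max, p_up, u):
    """phi[n] = E exp(iu M_n) and psi[n] = E exp(iu S_n^+), by enumeration."""
    phi = [1.0 + 0j]
    psi = [1.0 + 0j]
    for n in range(1, n_max + 1):
        e_max = 0j
        e_pos = 0j
        for steps in product((1, -1), repeat=n):
            prob = 1.0
            s = 0
            m = 0  # M_n = max(S_0, ..., S_n) with S_0 = 0
            for x in steps:
                prob *= p_up if x == 1 else 1.0 - p_up
                s += x
                m = max(m, s)
            e_max += prob * cmath.exp(1j * u * m)
            e_pos += prob * cmath.exp(1j * u * max(s, 0))
        phi.append(e_max)
        psi.append(e_pos)
    return phi, psi

def spitzer_gap(n_max=8, p_up=0.4, u=0.7):
    """Largest |n phi_n - sum_{k=1..n} psi_k phi_{n-k}| over 1 <= n <= n_max."""
    phi, psi = max_and_positive_part_chfs(n_max, p_up, u)
    return max(
        abs(n * phi[n] - sum(psi[k] * phi[n - k] for k in range(1, n + 1)))
        for n in range(1, n_max + 1)
    )

print(spitzer_gap())
```

The printed gap should be of the order of floating-point rounding error.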

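3°. For $s = 1$, interchanging $u$ and $v$, we obtain 

$$\sum_{n=0}^{\infty} t^n E(e^{iuM_n + ivS_n}) = f_+(u + v, t)\,f_-(v, t).$$

Upon setting $w = u + v$ then changing $w$ into $u$, and $v$ into $-v$, it 
becomes 

$$\sum_{n=0}^{\infty} t^n E(e^{iuM_n + iv(M_n - S_n)}) = f_+(u, t)\,f_-(-v, t).$$

Finally, upon multiplying by 

$$1 = \frac{1}{1 - t}\exp\Big\{-\sum_{n=1}^{\infty}\frac{t^n}{n}P(S_n > 0)\Big\} \cdot \exp\Big\{-\sum_{n=1}^{\infty}\frac{t^n}{n}P(S_n \le 0)\Big\},$$

we obtain 

$$\sum_{n=0}^{\infty} t^n E(e^{iuM_n + iv(M_n - S_n)}) = \frac{1}{1 - t}\exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}(Ee^{iuS_n^+} + Ee^{ivS_n^-} - 2)\Big\}$$

$$= \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}(Ee^{iuS_n^+} + Ee^{ivS_n^-} - 1)\Big\},$$

and this is the basic Spitzer identity in its initial form. 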

Extension. The basic exponential factors $f_+(u, t)$ and $f_-(u, t)$ may 
still have meaning when $u \in R$ is replaced by complex $z$. In fact, 

$$f_+(z, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{izS_n}: S_n > 0)\Big\}$$

is bounded and continuous for $\operatorname{Im} z \ge 0$ and regular for $\operatorname{Im} z > 0$, 

$$f_-(z, t) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{izS_n}: S_n \le 0)\Big\}$$

is bounded and continuous for $\operatorname{Im} z \le 0$ and regular for $\operatorname{Im} z < 0$. 

Thus the question arises whether the identities so far obtained remain 
valid for such $z$. The answer is in the affirmative for those identities in 
which figure only either $f_+$ or $f_-$; when both occur then, clearly, we must 
have $\operatorname{Im} z = 0$, that is, $z = u \in R$. These assertions result at once from 
the unicity lemma 15.2d, which yields (i) and (ii) below, while for (iii) 
we also use the fact that all the r.v.'s therein are nonnegative. 
E. Extended identities. The following identities are valid: 

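(i) For $\operatorname{Im} z \ge 0$, 

$$\frac{1}{1 - E(t^\tau e^{izS_\tau})} = f_+(z, t),$$

$$\sum_{n=0}^{\infty} t^n E(e^{izS_n}: \tau_n = n) = f_+(z, t),$$

$$\sum_{n=0}^{\infty} t^n E(e^{izS_n}: \nu_n = n) = f_+(z, t).$$

(ii) For $\operatorname{Im} z \le 0$, 

$$E\Big(\sum_{n=0}^{\tau - 1} t^n e^{izS_n}\Big) = f_-(z, t),$$

$$\sum_{n=0}^{\infty} t^n E(e^{izS_n}: \tau_n = 0) = f_-(z, t),$$

$$\sum_{n=0}^{\infty} t^n E(e^{izS_n}: \nu_n = 0) = f_-(z, t).$$

(iii) For $\operatorname{Im} z \ge 0$, $\operatorname{Im} z' \ge 0$, 

$$\sum_{n=0}^{\infty} t^n E(e^{izM_n + iz'(M_n - S_n)}) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}\big(E(e^{izS_n^+}) + E(e^{iz'S_n^-}) - 1\big)\Big\}$$

and, in particular, for $\operatorname{Im} z \ge 0$, 

$$\sum_{n=0}^{\infty} t^n E(e^{izM_n}) = \exp\Big\{\sum_{n=1}^{\infty}\frac{t^n}{n}E(e^{izS_n^+})\Big\}.$$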

Remark. In fact, the argument used for the unicity lemma 15.2d 
permits to prove simultaneously identities and a unique factorization 
theorem (Pollaczec, Ray, Kemperman). To fix the ideas, replace $u$ by $z$ 
in $P(u, t)$ and $Q(u, t)$ used in the proof of C: 

$$P(z, t) = \sum_{n=0}^{\infty} t^n E(e^{izS_n}: \tau_n = n), \qquad Q(z, t) = \sum_{n=0}^{\infty} t^n E(e^{izS_n}: \tau_n = 0).$$

Note that $P(z, t)$ like $f_+(z, t)$ ($Q(z, t)$ like $f_-(z, t)$) is bounded and 
continuous for $\operatorname{Im} z \ge 0$ ($\operatorname{Im} z \le 0$) and regular for $\operatorname{Im} z > 0$ ($\operatorname{Im} z < 0$) while 
for $\operatorname{Im} z = 0$ 

$$f_+(z, t)\,f_-(z, t) = \frac{1}{1 - tf(z)} = P(z, t)\,Q(z, t).$$

Therefore, for $\operatorname{Im} z = 0$, 

$$g(z) = \frac{P(z, t)}{f_+(z, t)} = \frac{f_-(z, t)}{Q(z, t)}$$

where the first (second) ratio is bounded and continuous for $\operatorname{Im} z \ge 0$ 
($\operatorname{Im} z \le 0$) and regular for $\operatorname{Im} z > 0$ ($\operatorname{Im} z < 0$). Thus, the two ratios are 
restrictions of a same bounded entire function $g(z)$ to $\operatorname{Im} z \ge 0$ and to 
$\operatorname{Im} z \le 0$, respectively. By Liouville's theorem, $g(z)$ is a constant. But 

$$\frac{P(z, t)}{f_+(z, t)} \to 1 \quad \text{as } \operatorname{Im} z \to +\infty$$

so that 

$$P(z, t) = f_+(z, t) \text{ for } \operatorname{Im} z \ge 0, \qquad Q(z, t) = f_-(z, t) \text{ for } \operatorname{Im} z \le 0.$$

This proves the corresponding extended identities together with unique 
factorization. 


All preceding identities in $f_+$ and $f_-$ which are in terms of exponentials, 
naturally, are called exponential identities. Their striking and 
unexpected feature is that the distributions of various fluctuation random 
variables are in terms of individual terms $S_n$ of the random walk. The 
sample space factorizations 

$$[\tau_n(X_1, \ldots, X_n) = k] = [\tau_k(X_1, \ldots, X_k) = k][\tau_{n-k}(X_{k+1}, \ldots, X_n) = 0]$$

and the equivalent one with $\tau$ replaced by $\nu$ are, naturally, called extreme 
factorizations. Their striking and unexpected feature is that the 
distributions of $\tau_n$ and of $\nu_n$ are determined by the pr.'s of their extreme 
values 0 and $n$. 


26.4 Fluctuations; asymptotic behaviour. We relate now the asymptotic behaviour of the random walk to that of fluctuations r.v.'s $\tau_A$, $\tau_n$, $\nu_n$, $M_n$; $A$ denotes a linear Borel set.

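a. Hitting time lemma. If $\tau_A$ is hitting time of $A$ then

(i) \[ 1 - E t^{\tau_A} = \exp\Bigl\{ -\sum_{n=1}^{\infty} \frac{t^n}{n} P(S_n \in A) \Bigr\}, \]

(ii) \[ P(\tau_A = \infty) = \exp\Bigl\{ -\sum_{n=1}^{\infty} P(S_n \in A)/n \Bigr\}, \]

(iii) \[ E\tau_A = \exp\Bigl\{ \sum_{n=1}^{\infty} P(S_n \in A^c)/n \Bigr\} + \infty \cdot P(\tau_A = \infty) \quad (\infty \cdot 0 = 0). \]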
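Proof. We use the elementary proposition: If the $a_n \ge 0$ and $\sum_{n=0}^{\infty} a_n t^n$ converges for $0 < t < 1$, then $\sum_{n=0}^{\infty} a_n t^n \to \sum_{n=0}^{\infty} a_n \le \infty$ as $t \uparrow 1$.

Set $\tau = \tau_A$. Identity (i) results from the first one in 26.3B(ii) with $u = 0$. Identity (ii) follows from (i) by letting $t \uparrow 1$ in

\[ E t^{\tau} = \sum_{n=1}^{\infty} t^n P(\tau = n) \to \sum_{n=1}^{\infty} P(\tau = n) = P(\tau < \infty) \]

so that

\[ P(\tau = \infty) \leftarrow 1 - E t^{\tau} = \exp\Bigl\{ -\sum_{n=1}^{\infty} \frac{t^n}{n} P(S_n \in A) \Bigr\} \to \exp\Bigl\{ -\sum_{n=1}^{\infty} P(S_n \in A)/n \Bigr\}. \]

Identity (iii) results from the second one in 26.3B(ii) with $u = 0$, by letting $t \uparrow 1$ so that

\[ \exp\Bigl\{ \sum_{n=1}^{\infty} P(S_n \in A^c)/n \Bigr\} \leftarrow \exp\Bigl\{ \sum_{n=1}^{\infty} \frac{t^n}{n} P(S_n \in A^c) \Bigr\} = \sum_{n=0}^{\infty} t^n P(\tau > n) \to \sum_{n=0}^{\infty} P(\tau > n) \]

and

\[ E\tau = \sum_{n=0}^{\infty} P(\tau > n) + \infty \cdot P(\tau = \infty). \]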
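As a numerical illustration of identities (i) and (ii), one can take the simple walk with $P(X = 1) = p$, $P(X = -1) = q = 1 - p$ and $A = (0, \infty)$, so that $\tau_A$ is the first passage time to $+1$. The sketch below assumes the classical closed form $E t^{\tau} = (1 - \sqrt{1 - 4pqt^2})/(2qt)$ for that passage time, which is not taken from the text; the function names and the truncation of the series at a finite number of terms are likewise illustrative choices.

```python
from math import comb, exp, sqrt


def prob_positive(n, p):
    """P(S_n > 0) for the simple walk: S_n = 2k - n with k up-steps."""
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(n // 2 + 1, n + 1))


def exp_series(t, p, terms=200):
    """Right side of (i): exp{-sum_{n>=1} t^n P(S_n > 0)/n}, truncated."""
    total = sum(t**n * prob_positive(n, p) / n for n in range(1, terms + 1))
    return exp(-total)


def one_minus_gf(t, p):
    """Left side of (i), 1 - E t^tau, from the assumed closed form."""
    q = 1.0 - p
    return 1.0 - (1.0 - sqrt(1.0 - 4.0 * p * q * t * t)) / (2.0 * q * t)


if __name__ == "__main__":
    p = 0.3
    print(one_minus_gf(0.5, p), exp_series(0.5, p))
    # Identity (ii): with t = 1 the series gives P(tau = infinity) = 1 - p/q.
    print(1.0 - p / (1.0 - p), exp_series(1.0, p, terms=400))
```

For $p < 1/2$ the terms $P(S_n > 0)/n$ decay geometrically, so a few hundred terms suffice; for $p \ge 1/2$ the series in (ii) diverges and the truncated value only drifts toward $0$.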

b. Finite interval lemma. Let $J$ be a finite interval and let $\tau$ be the hitting time of $J^c$. Then $E\tau^r < \infty$ for $r > 0$, and $ES_\tau = E\tau \cdot EX$ exists (and is finite) if and only if $EX$ exists and is finite.

The first assertion is Stein's lemma and the second one is Wald's relation, both obtained before general fluctuation theory.

Proof. The second relation was proved in 26.1 and it remains to prove the first one. To fix the ideas, let $J = [a, b]$. Since the only asymptotic alternatives are: a.s. $S_n \to -\infty$ or to $+\infty$ or $-\infty = \liminf S_n < \limsup S_n = +\infty$, there is an integer $m$ such that $p = P(|S_m| \le b - a) < 1$. But $[\tau > n + m]$ implies occurrence of independent events $[\tau > n]$ and $[|S_{n+m} - S_n| \le b - a]$, where $S_{n+m} - S_n$ has the same distribution as $S_m$. Therefore,

\[ P(\tau > n + m) \le p\, P(\tau > n) \]

and, by induction,

\[ P(\tau > km) \le p^k, \quad k = 1, 2, \ldots. \]

Therefore, $P(\tau > n) < (p^{1/m})^n$ so that the series $\sum_{n=0}^{\infty} t^n P(\tau > n) < \infty$ for $|t| < t_0 = p^{-1/m} > 1$. The first assertion follows.
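The geometric bound $P(\tau > km) \le p^k$ in the proof can be checked exactly in a small case. The sketch below is an illustration under assumptions that are not in the text: the simple symmetric walk, $J = [-2, 2]$ (so $b - a = 4$) and $m = 5$, for which $p = P(|S_5| \le 4) = 1 - 2/32 = 15/16$. The function name is hypothetical.

```python
from fractions import Fraction


def survival(n_max, a=-2, b=2):
    """Return [P(tau > n) for n = 0..n_max], tau = first n >= 1 with S_n outside [a, b].

    Assumption: simple symmetric walk started at S_0 = 0, with a <= 0 <= b.
    """
    half = Fraction(1, 2)
    mass = {0: Fraction(1)}  # probability mass of paths that have stayed in [a, b]
    out = [Fraction(1)]
    for _ in range(n_max):
        nxt = {}
        for x, w in mass.items():
            for y in (x - 1, x + 1):
                if a <= y <= b:
                    nxt[y] = nxt.get(y, Fraction(0)) + w * half
        mass = nxt
        out.append(sum(mass.values(), Fraction(0)))
    return out


if __name__ == "__main__":
    tail = survival(30)
    p = Fraction(15, 16)
    for k in range(1, 7):
        print(k, float(tail[5 * k]), float(p**k))
```

The bound is far from sharp here; what matters for the lemma is only that the tail of $\tau$ decays geometrically.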

A. Translation invariance theorem.

(i) If $J$ is a finite interval then $\sum_{n=1}^{\infty} P(S_n \in J)/n < \infty$.

(ii) $\sum_{n=1}^{\infty} P(S_n = x)/n < \infty$, $\ \sum_{n=1}^{\infty} P(S_n < x)/n + \sum_{n=1}^{\infty} P(S_n > x)/n = \infty$ for all $x \in R$.

(iii) Either $\sum_{n=1}^{\infty} P(S_n \prec x)/n < \infty$ for all $x \in R$

or $\sum_{n=1}^{\infty} P(S_n \prec x)/n = \infty$ for all $x \in R$,

where "$\prec$" stands for any one of the following inequality signs: $>$, $\ge$, $<$, $\le$.

(iv) If $\tau_x = \tau_{A_x}$ is the hitting time of $A_x$ where $A_x$ stands for any one of the following intervals: $(x, \infty)$, $[x, \infty)$, $(-\infty, x]$, $(-\infty, x)$, then

\[ P(\tau_x < \infty) = 1 \iff P(\tau_0 < \infty) = 1 \]

for all $x \in R$.

In particular, $P(\tau_x < \infty) = 1$ if and only if $\sum_{n=1}^{\infty} P(S_n \in A_0)/n = \infty$.

Proof. Assertion (i) results from a(iii) and b(i). Assertion (ii) results from (i) and the fact that the sum of the three series in (ii) is $\sum_{n=1}^{\infty} 1/n = \infty$.

Assertion (iii) for, say, $\ge$ and $x > 0$, results from $[0, \infty) = [0, x) + [x, \infty)$ by

\[ \sum_{n=1}^{\infty} P(S_n \ge 0)/n = \sum_{n=1}^{\infty} P(S_n \in [0, x))/n + \sum_{n=1}^{\infty} P(S_n \ge x)/n \]

where, by (i), the second series converges; similarly for the other choices of $\prec$ and $x \in R$. Finally, assertion (iv) follows, by a(ii), from (iii).
B. Three alternatives criteria.

($a_1$). The following properties are equivalent:

\[ S_n \to -\infty \ \text{a.s.}, \quad P(\tau_{(0,\infty)} < \infty) < 1, \quad \sum_{n=1}^{\infty} P(S_n > 0)/n < \infty, \quad EX < 0 \]

when $EX$ exists.

($a_2$). The following properties are equivalent:

\[ S_n \to +\infty \ \text{a.s.}, \quad P(\tau_{(-\infty,0)} < \infty) < 1, \quad \sum_{n=1}^{\infty} P(S_n < 0)/n < \infty, \quad EX > 0 \]

when $EX$ exists.

($a_3$). The following properties are equivalent:

$-\infty = \liminf S_n < \limsup S_n = +\infty$ a.s., $\ P(\tau_{(0,\infty)} < \infty) = 1$ and $P(\tau_{(-\infty,0)} < \infty) = 1$, $\ \sum_{n=1}^{\infty} P(S_n > 0)/n = \infty$ and $\sum_{n=1}^{\infty} P(S_n < 0)/n = \infty$, $\ EX = 0$ when $EX$ exists.

Proof. Assertions in ($a_3$) follow upon excluding the only two other alternatives ($a_1$) and ($a_2$). Assertions in ($a_2$) result from those in ($a_1$) by changing $X$ into $-X$ hence every $S_n$ into $-S_n$. Thus, it suffices to prove those in ($a_1$).

If $P(S_n \to -\infty) = 1$, we cannot have $P(\tau_{(0,\infty)} < \infty) = 1$ for then, by A(iv), $P(\tau_{(x,\infty)} < \infty) = 1$ for $x$ as large as we wish hence $\limsup S_n = +\infty$ a.s. Thus $P(\tau_{(0,\infty)} < \infty) < 1$ which, by a(ii), is equivalent to $\sum_{n=1}^{\infty} P(S_n > 0)/n < \infty$, and the first three properties in ($a_1$) are equivalent.

Finally, by 26.2C Corollary, when $EX$ exists then $S_n \to -\infty$ a.s. $\iff EX < 0$.

The proof is terminated.

Corollary. $P(\limsup S_n = +\infty) = 0$ or $1$ according as (i) $P(\tau_{(0,\infty)} < \infty) < 1$ or $= 1$, (ii) $\sum_{n=1}^{\infty} P(S_n > 0)/n < \infty$ or $= \infty$, (iii) $EX < 0$ or $EX \ge 0$ when $EX$ exists.

C. Asymptotic behaviour theorem.

(i) If $\sum_{n=1}^{\infty} P(S_n > 0)/n = \infty$ then a.s. $M_n \uparrow \infty$, $\nu_n \uparrow \infty$, $\tau_n \uparrow \infty$.

(ii) If $\sum_{n=1}^{\infty} P(S_n > 0)/n < \infty$ then a.s. $M_n \uparrow M_\infty < \infty$ with i.d. ch.f.

\[ E e^{iuM_\infty} = \exp\Bigl\{ \sum_{n=1}^{\infty} (E e^{iuS_n^+} - 1)/n \Bigr\}, \]

$\nu_n \uparrow \nu_\infty < \infty$, $\tau_n \uparrow \tau_\infty < \infty$, with common generating function

\[ E t^{\nu_\infty} = E t^{\tau_\infty} = \exp\Bigl\{ \sum_{n=1}^{\infty} (t^n - 1) P(S_n > 0)/n \Bigr\}. \]

Note that the hypotheses in (i) and (ii), being contrary to each other, are equivalent to their conclusions.

Proof. By the above Corollary $\sum_{n=1}^{\infty} P(S_n > 0)/n = \infty$ is equivalent to $\limsup S_n = +\infty$ a.s. hence $M_\infty = \sup_n S_n^+ \ge \limsup S_n = +\infty$ a.s. It follows that

\[ P(\nu_{n+1} = \nu_n + 1 \ \text{i.o.}) = P(\tau_n = n \ \text{i.o.}) = 1 \]

so that $\nu_n \uparrow \infty$ and $\tau_n \uparrow \infty$ a.s. Assertions (i) are proved.

By the same Corollary, $\sum_{n=1}^{\infty} P(S_n > 0)/n < \infty$ is equivalent to $\limsup S_n < \infty$ a.s., in fact, to $\lim S_n = -\infty$ a.s. But, by definition of limsup, $\limsup S_n < \infty$ a.s. implies $M_n \uparrow M_\infty < \infty$ a.s. and $P(\nu_{n+1} \ne \nu_n \ \text{i.o.}) = P(\tau_{n+1} \ne \tau_n \ \text{i.o.}) = 0$ hence $\nu_n \uparrow \nu_\infty < \infty$ and $\tau_n \uparrow \tau_\infty < \infty$ a.s.

We use now the classical Abel theorem: If the complex $a_n \to a$ finite then $(1 - t) \sum_{n=0}^{\infty} a_n t^n \to a$ as $t \uparrow 1$.

Since $M_n \uparrow M_\infty < \infty$ a.s., $E e^{iuM_n} \to E e^{iuM_\infty}$ hence, by the Pollaczek-Spitzer identity, as $t \uparrow 1$,

\[ E e^{iuM_\infty} \leftarrow (1 - t) \sum_{n=0}^{\infty} t^n E e^{iuM_n} = \exp\Bigl\{ -\sum_{n=1}^{\infty} t^n/n \Bigr\} \exp\Bigl\{ \sum_{n=1}^{\infty} t^n E(e^{iuS_n^+})/n \Bigr\} \]

\[ = \exp\Bigl\{ \sum_{n=1}^{\infty} t^n (E e^{iuS_n^+} - 1)/n \Bigr\} \to \exp\Bigl\{ \sum_{n=1}^{\infty} (E e^{iuS_n^+} - 1)/n \Bigr\}. \]

The last limit is an i.d. ch.f., since it is a product of i.d. ch.f.'s $e^{\psi_n}$ with

\[ \psi_n(u) = \frac{1}{n} \int (e^{iux} - 1)\, dF_{S_n^+}(x). \]

The first assertion in (ii) is proved.

Since $P(\nu_n = k) = P(\tau_n = k)$ for $k = 0, 1, \ldots, n$ and $n = 1, 2, \ldots$, $\nu_n \uparrow \nu_\infty < \infty$ a.s. and $\tau_n \uparrow \tau_\infty < \infty$ a.s. imply that $P(\nu_\infty = k) = P(\tau_\infty = k)$ for $k = 0, 1, \ldots$. Thus, to find the generating function of $\tau_\infty$ it suffices to find that of $\nu_\infty$: $E t^{\nu_\infty} = \sum_{k=0}^{\infty} t^k P(\nu_\infty = k)$. Since $\nu_\infty < \infty$ a.s., it follows, by extreme factorization, that

\[ P(\nu_\infty = k) \leftarrow P(\nu_n = k) = P(\nu_k = k) P(\nu_{n-k} = 0) \to P(\nu_k = k) P(\nu_\infty = 0). \]

But, by a(ii),

\[ P(\nu_\infty = 0) = P(\tau_{(0,\infty)} = \infty) = \exp\Bigl\{ -\sum_{n=1}^{\infty} P(S_n > 0)/n \Bigr\} \]

while, by 26.3C(ii) and the second relation in (i) therein with $u = 0$,

\[ \sum_{k=0}^{\infty} t^k P(\nu_k = k) = \exp\Bigl\{ \sum_{n=1}^{\infty} t^n P(S_n > 0)/n \Bigr\}. \]

Therefore,

\[ E t^{\nu_\infty} = \sum_{k=0}^{\infty} t^k P(\nu_k = k) P(\nu_\infty = 0) = \exp\Bigl\{ \sum_{n=1}^{\infty} (t^n - 1) P(S_n > 0)/n \Bigr\}, \]

and the proof is terminated.

This basic Spitzer theorem has the same striking and unexpected feature as the exponential identities: The limit distributions are in terms of individual sums $S_n$.
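A concrete case may help to see what the theorem asserts about $M_\infty$. For the simple walk with $P(X = 1) = p < 1/2$, $P(X = -1) = q = 1 - p$, the maximum is classically geometric, $P(M_\infty \ge k) = (p/q)^k$, so that $E s^{M_\infty} = (1 - p/q)/(1 - (p/q)s)$. The sketch below compares this with $\exp\{\sum_n (E s^{S_n^+} - 1)/n\}$. It assumes, beyond what is stated above, that the formula for the ch.f. may be used with $e^{iu}$ replaced by $s \in (0, 1]$ (the analogue of $\Im z \ge 0$ in the extended identities); the function names and the truncation level are illustrative.

```python
from math import comb, exp


def spitzer_exponent(p, s, terms=400):
    """Truncated sum_{n>=1} (E s^{S_n^+} - 1)/n for the simple walk with up-probability p."""
    q = 1.0 - p
    total = 0.0
    for n in range(1, terms + 1):
        # S_n = 2k - n with k up-steps; only S_n > 0 contributes to E s^{S_n^+} - 1.
        inner = sum(
            comb(n, k) * p**k * q**(n - k) * (s**(2 * k - n) - 1.0)
            for k in range(n // 2 + 1, n + 1)
        )
        total += inner / n
    return total


def geometric_pgf(p, s):
    """E s^M when P(M >= k) = (p/q)^k (assumed classical law of the maximum, p < 1/2)."""
    r = p / (1.0 - p)
    return (1.0 - r) / (1.0 - r * s)


if __name__ == "__main__":
    p, s = 0.3, 0.5
    print(exp(spitzer_exponent(p, s)), geometric_pgf(p, s))
```

Both sides are driven only by the laws of the individual sums $S_n$, which is the feature the theorem emphasizes.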


COMPLEMENTS AND DETAILS 

As throughout this chapter, $X_1, X_2, \ldots$ are iid summands with common nondegenerate law $\mathcal{L}(X)$, d.f. $F$, ch.f. $f$, and $S_0 = 0$, $S_n = X_1 + \cdots + X_n$. Slowly varying functions will be denoted by $h(x)$ with or without affixes.

1. Let $F_k$, $k = 1, 2$, be d.f.'s and let $x \to \infty$.

If $1 - F_k(x) \sim x^{-\alpha} h_k(x)$ then $1 - (F_1 * F_2)(x) \sim x^{-\alpha}(h_1(x) + h_2(x))$.

If $1 - F(x) \sim x^{-\alpha} h(x)$ then $1 - F_{S_n}(x) \sim n x^{-\alpha} h(x)$. Deduce similar propositions for $F_k(-x)$, $F(-x)$.
2. Extrema. Let $Y_n = \max_{1 \le k \le n} X_k$.

If $P(X < c) = 1$ for some constant $c$, then $\mathcal{L}(Y_n) \to \mathcal{L}(c)$.

If $P(X < x) < 1$ for every $x \in R$ then there exist scale factors $b_n > 0$ such that $\mathcal{L}(Y_n/b_n) \to \mathcal{L}(Y)$ nondegenerate if and only if $1 - F(x)$ varies regularly with exponent $\alpha < 0$, and then $F_Y(x) = 0$ or $e^{-cx^{\alpha}}$ with $c > 0$, according as $x < 0$ or $x > 0$. What about $Z_n = \min_{1 \le k \le n} X_k$?

3. Let $S$ be a $(\lambda, f)$-compound Poisson: $f_S = e^{\lambda(f - 1)}$. Let $x \to \infty$.

If $1 - F(x) \sim x^{-\alpha} h(x)$ then $1 - F_S(x) \sim \lambda x^{-\alpha} h(x)$. Is there a similar proposition about $F(-x)$ and $F_S(-x)$?

4. Let $F$ be an i.d. d.f. with $f = e^{\psi}$, $\psi = (\alpha, \beta^2, L)$. Let $x \to \infty$.

If $L(x) = -x^{-\alpha} h(x)$ then $1 - F(x) \sim -L(x)$. Is there a similar proposition about $L(-x)$ and $F(-x)$?

5. Norming. Let $\mathcal{L}(X)$ be attracted by a nondegenerate stable $\mathcal{L}_\gamma$, $0 < \gamma \le 2$, that is, $\mathcal{L}(S_n/b_n - a_n) \to \mathcal{L}_\gamma$ for suitable $b_n > 0$ and $a_n$.

(a) Let $\mu_2(t) = \int_{|x| < t} x^2\, dF(x)$ and $q(t) = 1 - F(t) + F(-t)$. Let $t \to \infty$ and use 25.1D.

l(r< y then ^-r-r I 丨 X YdF{x)^- 
，、，- ^1*1 <t y 

1( r > y when y < 2 then I I x | r < 

E\ X\ r < oo for r < t and E\ X\ r = oo for 


I x \ r dF{x) 


t r q{t). Deduce that 




we can take a n = EX: Use (a). 

(c) Scale factors. All suitable scale factors $b_n$ are of the form $b_n = n^{1/\gamma} h(n)$: Use $|f^n(u/b_n)| = e^{-c|u|^{\gamma}}(1 + o(1))$, replace $n$ by $nk$ then $u/b_{nk}$ by $(b_n/b_{nk})\,u/b_n$, note that $o(1) \to 0$ uniformly in every given finite interval, show that if the sequence $(b_n/b_{nk})$ is not bounded then $e^{-ck} = 1$, which is impossible, and finally $b_{nk}/b_n \to k^{1/\gamma}$.


6. Standard domains of attraction. We say that $\mathcal{L}(X)$ belongs to the standard domain of attraction of a nondegenerate stable $\mathcal{L}_\gamma$ if $b_n = b n^{1/\gamma}$, $b > 0$, are suitable scale factors. (The usual but confusing term is "normal" not "standard.")

$\mathcal{L}(X)$ belongs to the standard domain of attraction of a nondegenerate stable $\mathcal{L}_\gamma$ with $0 < \gamma < 2$, if and only if, as $x \to \infty$, $x^{\gamma}(1 - F(x)) \to b^{\gamma} c p$ and $x^{\gamma} F(-x) \to b^{\gamma} c q$, $c > 0$, $p, q \ge 0$.

$\mathcal{L}(X)$ belongs to the standard domain of attraction of $\mathcal{N}(0, 1)$ if and only if $EX^2 < \infty$, and then $b_n = \sigma n^{1/2}$ with $\sigma = \sigma X$.

7. Estimates for $E|S_n|^r$. Let $\mathcal{L}(X)$ with $EX = 0$ belong to the standard domain of attraction of $\mathcal{L}_\gamma(Y)$ with $1 < \gamma < 2$.

(a) $\mathcal{L}(S_n/n^{1/\gamma}) \to \mathcal{L}_\gamma(Y)$, $F(-x) \le c x^{-\gamma}$ and $1 - F(x) \le c x^{-\gamma}$ for some constant $c > 0$.

(b) There is a positive $a$ independent of $n$ such that for $x \ge x_0$ independent of $n$,

\[ P(|S_n|/n^{1/\gamma} > x) \le a/x^{\gamma}. \]

(c) For $0 \le r < \gamma$ there is a positive $b = b(r)$ independent of $n$ such that

\[ E(|S_n|/n^{1/\gamma})^r \le b. \]

(d) $E(S_n/n^{1/\gamma}) \to EY$ and $E(|S_n|/n^{1/\gamma})^r \to E|Y|^r$ for $0 \le r < \gamma$.

8. Partial attraction. $\mathcal{L}(X)$ is said to belong to the domain of partial attraction of a nondegenerate $\mathcal{L}(Y)$ if there is a subsequence $(k_n)$ of integers such that, for suitable $b_n > 0$ and $a_n$, $\mathcal{L}(S_{k_n}/b_n - a_n) \to \mathcal{L}(Y)$. It is a property of types of laws. Discuss the propositions below in whichever order is preferred.

(a) Every $\mathcal{L}(X)$ belongs to the domain of partial attraction of either no type or of one type or of an uncountable family of types.

(b) If $\mathcal{L}(X)$ belongs to the domain of partial attraction of only one type, then this type is stable.

(c) A symmetric distribution with slowly varying two-sided tail belongs to no domain of partial attraction.

(d) If $f$ belongs to the domain of attraction of an i.d. law, so does the i.d. $e^{f - 1}$.

An i.d. law need not belong to its own domain of partial attraction: Use the first statement and (c).

(e) If $\mathcal{L}(X)$ is partially attracted by $\mathcal{L}(Y)$ which is partially attracted by $\mathcal{L}(Z)$ then $\mathcal{L}(X)$ is partially attracted by $\mathcal{L}(Z)$. The domain of partial attraction of a stable law is strictly larger than its domain of attraction.

OO 

(0 Let i.d. f n = e^ n have bounded Set rp n (i n u)/k n . There are 

» 画 i 

b n > 0 and integers k n — ① such that k n 4>{u/b n ) — ^ n («) —> 0, tt C 兄 

(g) If $f$ is partially attracted by i.d. $e^{\psi_n} \to e^{\psi}$ then it is partially attracted by $e^{\psi}$. Is the i.d. property of the $e^{\psi_n}$, $e^{\psi}$ needed?

(h) Every i.d. f = has a nonempty partial domain of attraction: Note 
that there are compound Poisson e^ n —>/, and use lim e kn ^ u(an) = lim 

n 

oo 

(i) Lévy example: $f = e^{\psi}$ with $\psi(u) = 2 \sum_{k=-\infty}^{\infty} 2^{-k}(\cos 2^k u - 1)$ is i.d. Find its Lévy function. Show that $f^{2^n}(u) = f(2^n u)$; $f$ is not stable but partially attracts itself.

(j) Every sequence of i.d. laws has an i.d. law belonging to the domain of partial attraction of each of its terms.

(k) Doblin universal laws. There are i.d. laws belonging to the domain of partial attraction of every i.d. law. Consider the countably many i.d. laws, ordered into a sequence $(e^{\psi_n})$, whose Lévy functions are purely discontinuous with only rational discontinuities and only rational jumps, every i.d. $e^{\psi}$ is limit of a subsequence of $(e^{\psi_n})$, and use (j).

9. Consider random walk on lattices with, to simplify, span 1. 

(a) Such a random walk forms a constant Markov chain with a countable 
number of states. What is its transition matrix? 

(b) Interpret the concepts and results in the Introductory Part III in terms 
of those in §26. 

(c) Discuss the Introductory Part CDIII in terms of §26 and complete it. 

10. (a) A truly two-dimensional random walk with zero expectations and 
finite variances is recurrent. 

(b) A truly three-dimensional random walk is always transient. What about $m$-dimensional random walks with $m > 3$?

For (a) and (b), use ch.f.'s analogously to the one-dimensional case in 26.2.
11. $ES_\tau$. Let $EX = 0$ and $\sigma^2 = \sigma^2 X$ $(\le \infty)$.

(a) $E|S_n|/n^{1/2} \le a$ for some constant $a$ and all $n$. In fact, $E|S_n|/n^{1/2} = 2ES_n^+/n^{1/2} \to \sigma\sqrt{2/\pi}$.

(b) Let $A = (-\infty, 0]$ or $(0, \infty)$.

<r 2 < oo = ES 、and are both finite, and then 

es ^ = exp - p(Sn c A - 

12. Arcsine law. (a) Complete the computations in the proof of Arcsine law 
in 26.1. 

(b) Let f ^ — P(<y n > 0)) be finite. Then 

P{v n , = 0) 〜 f/V^Trw, P{v n = ») = e^ c V2irn 
and the Arcsine law holds. 

(c) Andersen and Spitzer generalizations. Let $a_n = P(S_n > 0)$.

\[ (a_1 + \cdots + a_n)/n \to a \iff \mathcal{L}(1 - \nu_n/n) \to \mathcal{L}(Y) \]

with $\mathcal{L}(Y) = \mathcal{L}(1)$ for $a = 0$, $\mathcal{L}(Y) = \mathcal{L}(0)$ for $a = 1$ and, for $0 < a < 1$,

\[ P(Y < y) = \frac{\sin \pi a}{\pi} \int_0^y x^{-a}(1 - x)^{a - 1}\, dx; \]

if $(a_1 + \cdots + a_n)/n$ does not converge then $\mathcal{L}(1 - \nu_n/n)$ does not converge. (Andersen case: $a_n \to a$.) If $a = 1/2$, $\mathcal{L}(Y)$ is Arcsine law.

Use Kemperman's recurrence relation: Let $b_n(k) = E(n - \nu_n)^k$; $b_n(0) = 1$, $b_n(1) = n - (a_1 + \cdots + a_n)$, $b_0(k) = 0$ for $k = 1, 2, \ldots$. Then

\[ b_n(k + 1) = n\, b_n(k) - \sum_{m=0}^{n-1} a_{n-m}\, b_m(k). \]

When $(a_1 + \cdots + a_n)/n \to a$ then $E(1 - \nu_n/n)^k \to EY^k$ with $EY^k = (1 - a)(1 - a/2) \cdots (1 - a/k)$; apply Ch. IV, CD 10 (Spitzer).

13. Identities and limit distributions. Let $\nu_n$, $\nu'_n$, $\bar\nu_n$, $\bar\nu'_n$ be respectively the number of positive, nonnegative, negative, nonpositive sums in $(S_0, \ldots, S_n)$. Let $\tau_n$, $\tau'_n$, $\bar\tau_n$, $\bar\tau'_n$ be respectively the time of occurrence of the first maximum $M_n$, the last maximum $M_n$, the first minimum $\bar M_n$, the last minimum $\bar M_n$ of $(S_0, \ldots, S_n)$.

(a) The equivalence relation $P(\nu_n = k) = P(\tau_n = k)$ remains valid if same affixes above are added simultaneously to $\nu$ and to $\tau$; similarly for $E(e^{iuS_n} : \nu_n = k) = E(e^{iuS_n} : \tau_n = k)$ and, more generally, for $E(f_n : \nu_n = k) = E(f_n : \tau_n = k)$ in 26.1. What about extreme factorizations?

(b) Which exponential identities in 26.3 and results in 26.4 remain valid or have to be modified accordingly when the same affixes are added?








14. Ranked sums (order statistics). Order the sums as follows: $S_i(\omega)$ precedes $S_j(\omega)$ if $S_i(\omega) < S_j(\omega)$ or $S_i(\omega) = S_j(\omega)$ but $i < j$. For every $k = 0, \ldots, n$, let $R_{nk}(\omega)$ be the $k$th from the bottom of $S_0(\omega), \ldots, S_n(\omega)$ according to this ordering. Let $r_{nk}(\omega)$ be the index of the corresponding $S_j$, that is, $r_{nk}(\omega) = j \iff R_{nk}(\omega) = S_j(\omega)$. Note that $R_{n0} \le \cdots \le R_{nn}$, $R_{n0} = \bar M_n$ is the first minimum occurring at time $r_{n0} = \bar\tau_n$ and $R_{nn} = M_n$ is the last maximum occurring at time $r_{nn} = \tau'_n$.

Discuss the following Wendel identities:

Es 9 ^ = Es 9jth • 

= Ee ivMh • Ee ivMn ^ k y 

^ e iuS n +ivR n kj = 